PREFACE


This book is a summary of a short course of lectures given to
the post-graduate students in education at University College,
Nottingham, and at some of the training colleges in the
University College of Nottingham Delegacy for the Training of
Teachers.

It is the writer's experience that most works on statistics present
considerable difficulties to the student of education who has not
had practice in handling arithmetical quantities and mathematical
formulae. The writer of the present work has tried to make it as
simple as possible. To an increasing extent the published results
of psychological researches and the more humble analyses of test
scores and marks are set down in mathematical form, and often
involve the calculation of correlation coefficients, the analysis of
variance and other means of comparing metrical estimations of
human abilities and traits. The future should see a great expansion
of class-room research by teachers and this will often call for
simple statistical methods.

For the serious student who may wish to continue with this
work in one or more of its branches, a bibliography which includes
more advanced books is provided. It is proposed to follow
this introductory account with a more advanced treatment of
factorial analysis and the analysis of variance. The writer has to
thank Sir Cyril Burt for permission to quote from the laboratory
notes used at University College, London.


W. L. S. 
December 1946 


PREFACE TO SECOND EDITION 


A second edition of this work was soon called for, and this has
given the author the opportunity to make some additions to
the book. It is not possible to cover a comprehensive field in
the applications of statistical methods to psychology and education
without mathematics. In case any reader is put off by the use of
arithmetic and algebra in this book, the early part of it has been
expanded with more examples and diagrams. The work has been
so arranged that those who are not prepared to work through the
later chapters and the mathematical appendices will, it is hoped,
find sufficient simple material in the chapters on distribution and
correlation to help them with their practical problems.

One or two statistical techniques of small value, such as Spearman's
'footrule', have been dropped from the text to leave more
space for examples.

Inevitably in a book no larger than this, limitations of space will
demand the omission of certain topics and breadth of treatment.
There is the ever-present danger that statistical methods and
formulae may be applied to data without the necessary insight into
the validities and limitations of the processes used. In the present
edition further space has been devoted to sampling techniques
and to the limitations of the various statistical devices which have
been reviewed. It is hoped that enough guidance has been given,
either herein or by reference to more specialized works, to enable
students to start exploring a wide and ever-expanding domain.

The author would like to thank the late Professor Hamley of
the University of London, whose untimely death is an incalculable
loss to the cause of educational research, and Dr. W. D. Wall of
Birmingham University for kindly reading the first edition. Miss
D. Wood, Vice-Principal of Weymouth Training College, and Mr.
N. C. Flower of Nottingham University have kindly contributed
examples.

W. L. S.
The University
Nottingham
March 5th, 1949


CHAPTER I


INTRODUCTION 


THE NATURE OF MENTAL MEASUREMENT 


With numbers all men may contend, their charming systems to defend.
GOETHE 


We may look at the science of statistics from two viewpoints.
Firstly, it may be regarded as the process of collecting
figures which represent such things as amounts of
exports, price levels, temperatures and barometric pressures from 
day to day, examination marks and so on, for which some scale 
of measurement has been found in a world which becomes pro- 
gressively more metrical. Secondly, statistics is the study of the 
means of manipulating and arranging figures, applying mathe- 
matical processes and thereafter interpreting the results. 
Scientific workers try to use the most effective language for 
their particular purposes. Clear verbal description is a necessity 
of course, but the precise language of mathematics is also necessary 
both to describe and to manipulate the results of observations. 
Scientists usually feel that they are on firm ground when they can
provide a ‘measuring stick’ in order that they can give quantitative 
results at the end of their experiments and observations. It must 
be remembered that these results are completely dependent not 
only on the accuracy of the observations, but also on the size and 
accuracy of the ‘measuring stick’. There is nothing absolute about 
their findings; they are merely a matter of comparison with an 
agreed unit of a scale, which in itself is an arbitrary measurement 
accepted by a large number of workers as a convenient common 
standard. In the physical sciences where we begin with measure- 
ments of length, which lead to those of area, volume and mass, 
and the measurement of time, there are considerable difficulties in 
fixing standards. (We assume, for instance, that time has certain 
properties of length and direction, and may be thought to have 
some of the properties of a straight line. Great and bewildering
new discoveries were made by Einstein and others in the field
of physics, when some of the elementary foregone conclusions 
concerning measurements of length and time were challenged.) 

In the study of the ‘properties’ of the human mind, the problem 
is much more difficult. The mind is not a thing to be measured 
and weighed as can the whole physical human body, or even its 
brain. When we talk about the factors of the mind, the abilities 
or intelligence of man, we have to be careful to avoid the pitfall of
thinking of these as so many tangible quantities each capable
of measurement in terms of length, volume or force and so on. It 
is only fairly recently that the ‘faculty’ psychology (which was 
kept alive by educationists long after its natural term of years) 
has been properly buried. The mind must not be thought of in 
terms of a series of faculties, such as intelligence, memory or wit, 
and it would be unfortunate if we were to bury ‘faculties’ and to 
resurrect ‘factors’ in their place. 

The study of arithmetic should always be sustained by logical 
thought, but many people tend to accept figures and numbers 
uncritically. It has been said cynically that statistics are the worst 
form of falsehood. This ought not to be correct, but the position 
may always be safeguarded by a critical examination of the things 
or ideas which underlie them. A simple example of this will 
suffice. Some years ago some statistics were used in an unscrupu- 
lous endeavour to show that insulin therapy was useless in cases 
of diabetes. It appeared that more people had died each year 
from this disease since the introduction of insulin than before it 
had been discovered. Moreover, the figures were correct as 
they stood! A little thought will show that the figures had been 
used to sustain a false argument. Diagnosis of the complaint had 
improved and thus diabetes had later been given as a cause of 
death, whereas before, the condition was ascribed to heart failure, 
pneumonia or internal inflammation. Also, interesting as a cause 
of death may be, from the statistical point of view, what really 
matters is whether insulin has extended useful lives, perhaps until 
fairly advanced age, even though death eventually takes place, as 
it must for everybody from one cause or another. The conclusion 
is that insulin is useful. 
Investigations in the physical sciences are on the whole easier
than those on the measurement of human and social factors. In 
the physical sciences we are usually able to isolate the property 
which we wish to measure and to insulate it, so to speak, from dis- 
turbing external influences. Different physical properties do not 
usually cause mutual perturbations which worry the physicist. In 
any case he can allow for them accurately. He is not usually 
troubled about the barometric pressure of the room, the colour,
the magnetic and electrical properties of a piece of metal when he 
is measuring its specific heat. Moreover, he is able to use units 
which can be measured in a linear way and about which there is 
universal agreement.

The matter is not so simple for the psychologist, educationist 
and even the biologist, for they find it difficult, or even impossible, 
to proceed from cause to effect. The quantities which we think 
we have isolated and measured today have changed by the 
morrow. When we believe that we have isolated a physical system 
in the living body or a ‘factor’ in the mind, the integration of 
function and the working unity of the whole have to be taken into 
account even when we hope that we are studying some specific 
small ‘part’. The twofold aspects of mental activity, the cognitive 
or intellectual and the orectic or striving and emotional have to 
be thought of as being distinct when we try to measure various 
manifestations of either of them. It does not need much experience 
and thought to see that there are enormous difficulties in isolating 
their factors. It is one of the triumphs of modern experimental
psychology and statistical analysis, that in a large measure we
have been able to clear away misconceptions concerning the
so-called ‘factors of the mind’ and to substitute ideas which are
based on scientific principles. Although we cannot always resist 
the temptation to ‘reify’ certain well-marked aspects of mental 
activity, we must avoid the temptation to think of these aspects as 
concrete quantities even if we discover a scale by which they can 
be estimated on a quantitative basis. We shall meet this exceedingly 
important consideration again. 


* When he is dealing with the ultimate particles of matter he finds
that statistical methods have to be used.
All mathematical problems which try to provide information
concerning the world external to the investigator can be thought 
of in three stages: 


(1) The collection of data, taking care that we have the proper
‘measuring rod’ for the job in hand and that we know how to
use it.


(2) By mathematical processes, the manipulation of the figures 
of the data, and eventually the arrival at a numerical result. 


(3) The interpretation of the result in relation to the original
data. We apply the result to give us further information or to
predict possible future happenings. 


At length we may go from generalizations to tentative ‘laws’. 
Unfortunately, the second step is the only one which has been 
stressed in schools in the past. Really, it is but a link in a more 
important and lengthy chain of reasoning.
To make this matter clear let us take as an example a problem 
from psychological research. Suppose we wish to find whether 
there is any general measure of agreement (correlation) between
ability in classical studies and general intelligence. In the first 
stage of our investigation we have to evolve a suitable examination 
in classics for each age group, which will ensure that everyone 
has a fair chance and that there are sufficient questions and 

examinees to avoid errors of sampling. The examination paper 
should be suitable for ready marking on a scale which is in

keeping with certain statistical requirements. The measurement 
of intelligence is not such an easy matter. Nevertheless, without
enlarging on the considerable difficulties which beset a task which
many people imagine to be relatively simple, we will assume that a
set of marks in classics and a score in an intelligence test given to 
the same large number of pupils have been obtained. 

The second stage is the mathematical process whereby a
coefficient of correlation between the marks in classics and the 
scores in intelligence tests is obtained. 

The last stage is to ask whether this coefficient is significant, 
how many times larger is it than the probable error, what is the 
meaning and value of this correlation, what relationship has it to
other possible correlations, and to what conclusions and further
investigations of educational significance, if any, will it lead?
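
The second and third stages can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the marks below are hypothetical (they do not come from any investigation), the function names are invented for the sketch, and the probable error is taken in its classical form, 0.6745 (1 - r²) / √N.

```python
import math

def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def probable_error(r, n):
    """Classical probable error of r: 0.6745 * (1 - r^2) / sqrt(n)."""
    return 0.6745 * (1 - r * r) / math.sqrt(n)

# Hypothetical marks for eight pupils (illustrative only, not from the text).
classics = [42, 55, 61, 48, 70, 66, 53, 59]
intelligence = [98, 104, 115, 101, 122, 110, 108, 112]

r = pearson_r(classics, intelligence)
pe = probable_error(r, len(classics))
# Stage three begins here: how many times larger than its probable error is r?
print(round(r, 3), round(pe, 3), round(r / pe, 1))
```

With so few pupils the ratio printed last should be read with caution; the sketch shows only the arithmetic, not what the result means.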

Although we have used the term ‘yardstick’ loosely in dealing 
with mental characteristics, it must be noted that there is a great 
difference between mental measurements and those of tangible 
and physical quantities. For instance, a length of seven feet is 
equivalent to the sum of the lengths of seven separate feet, but a 
similar consideration does not apply to the type of numerical 
abstraction which is obtained in the measurement of human 
abilities or sensory discrimination. Mental measurements have 
to be made by indirect means and are further complicated by the 
fact that the very things which are measured are ill-defined and 
that psychologists may even differ as regards the definitions of the 
factors which it is proposed to measure. The measurement of so- 
called ‘general intelligence’ is a case in point. All psychological 
measurement involves sampling and it is necessary to take steps 
to ensure that the sample is fully representative of the group, and 
secondly that it is large enough to reduce errors of sampling to 
small proportions. Moreover, it is necessary to know what are 
the possible errors which may mar an estimate made with samples 
of particular sizes. In addition to errors which are due to sampling 
there are other difficulties. We must know the degree of validity 
of a test as a measure of a particular characteristic. It has been
claimed that tests have been evolved which are a ‘measure of 
pure intelligence’. On investigation, it is found that such tests are 
loaded (or saturated) with a general cognitive factor to little more 
than 70% of their whole variance. Again, a test should have self- 
consistency or reliability. If it is divided into two parts by taking 
the odd and even numbered questions separately, there should 
be a high degree of agreement between the results scored in each 
half of the test. Although consistency in a test is essential to its 
validity it is not, of course, sufficient to determine the latter. We 
shall deal with these matters in a later chapter. 

Finally, in educational measurement there is always the possi- 
bility of irrelevant factors disturbing the estimation of particular 
characteristics. Hitherto, most mental measurements have dealt 
with the cognitive or intellective factors of mental activity, and
it is difficult to separate these from conative or emotional disturb- 
ing elements. Even the simplest individual is a rich and complex 
integration of mind and body which is fluctuating from day to 
day, or even from moment to moment. The physicist's brass
weight is not sensibly different today from what it was yesterday, 
but the human body-mind can never be the same, and it may have 
changed considerably. It is one of the triumphs of modern statis- 
tical analysis that we are able to carry along the disturbing factors 
in an investigation, allow for them, and to a large measure
eliminate their influence. 

Nevertheless, it must be emphasized that statistical investiga- 
tions are fraught with the possibilities of error. What is true for 
very large numbers of cases, when they are dealt with as a whole, 
is not necessarily true for smaller samples and still less for indi- 
viduals. Thus, the problems of sampling and the estimation of 
errors are important in this work. As will be seen later the 
numerical result of an investigation will only be significant when 
it is a sufficiently large multiple of the errors such as are inevitable
in educational measurement. In the past the theory of measure- 
ment has been neglected in the elementary stages of the physical 
sciences, but the student of educational measurements must face 
the problem from the start. 


CHAPTER II 


DISTRIBUTIONS AND DISPERSIONS 
OF SCORES 


If we measure the heights of a large number of boys of the
same age, we find that they are distributed in a definite way.
We can imagine the boys lined up against a long wall starting
with the smallest boy and making each successive boy slightly
higher than the last, in going from left to right. The line joining
the tops of their heads will be a curve with a shape which would
be an elongation of the following:


[Figure: ogive, with HEIGHT on the vertical axis and CUMULATIVE NUMBER OF INDIVIDUALS on the horizontal axis.]

Fig. 1. An ogive or cumulative frequency curve. The curve can also be drawn
with the number of cases given vertically (ordinates) and the marks or other measures
given horizontally (abscissae).

It is known as an OGIVE (because of a similar curve which
appeared in classical architecture). We could obtain the same 
curve by picking a thousand ears of wheat from a field (or a large 
number of peapods of the same crop) and arranging each of them 
vertically in a horizontal row, starting with the smallest and 
finishing with the longest. In biology and psychology we can think 
of many measurements of a similar kind made on a large number 
of things of the same type, which would give an ogive if plotted 
in this way. We shall meet this curve again when we are dealing 
with percentiles. It is sometimes known as a cumulative frequency 
curve. It is often more useful to find the frequency or the number
of cases occurring in each range whether of height, weight, marks,
intelligence quotient, etc. An easy way is to plot a HISTOGRAM.
Consider the following distribution of marks in which each step
is one of 10 marks. 


Marks      No. of Pupils
 0- 9            3
10-19           12
20-29           21
30-39           28
40-49           35
50-59           37
60-69           29
70-79           17
80-89           10
90-99            5


The height (and therefore the area) of each column gives a 
measure of the number of pupils whose marks lie between the 
figures at the foot of the column. The whole area of the rectangu- 
lar columns gives the total number of pupils. Here a word of 
warning is necessary, and it is wise to keep in mind the scales
which are used for the marks along the horizontal axis and for the 
frequencies which are vertical measurements. The value of a unit 
area on the graph will serve as a guide. The histogram is some- 
times spoken of as a Column Diagram. 
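
As a rough check on the table above, the frequencies can be totalled and accumulated in Python. This is a minimal sketch: the variable names are invented for the illustration, and the frequency for the 30-39 step is read as 28.

```python
# Frequency table from the text: lower limit of each 10-mark step -> pupils.
frequencies = {0: 3, 10: 12, 20: 21, 30: 28, 40: 35,
               50: 37, 60: 29, 70: 17, 80: 10, 90: 5}

# The total corresponds to the whole area of the columns.
total = sum(frequencies.values())

# Running totals give the points of the cumulative frequency curve (ogive).
cumulative = []
running = 0
for lower in sorted(frequencies):
    running += frequencies[lower]
    cumulative.append(running)

# A rough text histogram: one '#' per pupil.
for lower in sorted(frequencies):
    print(f"{lower:2d}-{lower + 9:2d} {'#' * frequencies[lower]}")
print("Total pupils:", total)   # 197
```

The last running total equals the total number of pupils, which is a convenient check on the tallying.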


[Figure: histogram of the distribution above; horizontal axis marked from 0 to 100 in steps of 10.]

Fig. 2. Histogram.


1 The frequencies may be grouped by counting the number of scores which fall
in each interval and noting them by the tally method in fives (four strokes crossed by a fifth).
Suppose we now consider the mid-points of the top of each
column to be joined by straight lines and completed at each end
by further straight lines joined to the horizontal line as shown in
the diagram. We then have a FREQUENCY POLYGON. The frequency
polygon does not give quite the exact picture of the data
which is yielded by the histogram, especially when the number 
of cases is small, but frequency polygons may be superimposed 
and compared and this is a useful property. 


[Figure: frequency polygon of the same distribution; horizontal axis marked from 0 to 100 in steps of 10.]

Fig. 3. Frequency polygon.


It will readily be appreciated that if we take a large number of 
cases which show distribution in a regular manner, the frequency 
polygon will take such a shape that it suggests a ‘smoothness’ 
which would tend to a curve if the intervals of marks became 
smaller as the numbers of cases became larger. 

We now come to a most important case of frequency distribu- 
tion. This is represented by the curve of normal distribution or what 
was formerly called the curve of error or the probability curve. 

Suppose we measure the heights of 10,000 adult Englishmen 
and plot a histogram showing the number in each half-inch range 
from (say) 58 inches to 77 inches. (It is possible that we may even
have to extend the range to include men smaller than 4 feet
10 inches, and those taller than 6 feet 5 inches.) If we can now 
join the mid-points of the tops of the columns and then smooth 
the frequency polygon to make a curve we should get a shape 
like the following: 




Fig. 4. 


This distribution is of the utmost importance in science. We shall 
refer to it as the curve of normal distribution. It used to be 
called the curve of error because it showed astronomers the 
distributions of the errors in their readings about the correct 
value, or again, in gunnery it gave the frequencies of the missiles 
in respect of their distances from the target after the range had 
been found.¹ The curve is also known as the probability curve
for reasons which will be apparent in a later section of this book.
If a curve is not symmetrical about a line drawn through its
highest point it is said to be SKEWED and is known as a SKEW
CURVE.


1 For the properties of this important curve see Chapter V and the appendix. 
Below is a positively skewed curve and the greatest frequency
occurs before we come to the middle 'score': 


Fig. 5. Positively skewed curve. 


and this is a negatively skewed curve and the greatest frequency 
occurs after the middle score. 


Fig. 6. Negatively skewed curve.

We shall see how the skewing of curves of examination marks and 
test scores affects the value of the investigation, when we come to 
apply these matters to the problems of marking. 


12 STATISTICS IN SCHOOL 


A curve like the following is known as a bimodal curve because it 
contains two humps, modes or most ‘popular’ scores. We might
obtain such a curve if we gave an intelligence test to a large 
number of children which consisted of two groups whose abilities 
were sharply divided. 


Fig. 7. A bimodal curve. 


It will be observed that the curve of normal distribution is 
symmetrical about a vertical line drawn through its highest point. 
If instead of the heights of a large number of Englishmen, the 
curve were made to represent the scores of a large number of 
children in an examination, this line would be a measure of the 
maximum number of children in any of the mark groups. In the 
case of the symmetrical curve we see that (a) the mark which was 
scored by the greatest number of children was the average mark
of 50%, (b) the middle child in an order of merit list scored the
average mark. This is obvious as the area enclosed by the curve 
to the left of the central straight line is equal to the area enclosed 
by the curve to the right of this line. 

It will be noticed that in this and other curves there is a
central tendency. The average value (score, mark, height, etc.)
is called the MEAN. The value of the middle case (e.g. the
mark of the pupil who is half way down an order-of-merit list
or rank) is called the MEDIAN. The score, mark, height, etc.
which relates to the largest number of individuals is called the
MODE.

Example: The following is a list of marks obtained by school- 
children in a geography test. Find the mean or average. 
Pupil      %
A          45
           70
           21
           32
           51
           68
           48
...


Divide 16)781(48.8

Average 48.8%


Add each column down and check by adding up: tick the
column total when agreement is reached.


If the marks are represented by x,


the Mean $M = \frac{\sum x}{N}$, where $\sum$ (sigma) is the sum of (the scores)
and N is the number of pupils.


An easier way of calculating an average (especially where there 
is no great spread of the measures) is to guess the mean and 
then adjust it by summing the differences of each measure from 
this mean and dividing by the number of measures, e.g. 

Find the mean of the following marks:


                  Guessed       Difference
Pupil    Marks    Average       +       -
A         61        50         11
B         40        50                 10
C         52        50          2
D         37        50                 13
E         71        50         21
F         47        50                  3
G         54        50          4
H         32        50                 18
I         73        50         23
J         45        50                  5
K         64        50         14
L         38        50                 12
M         41        50                  9
N         50        50
O         46        50                  4
P         53        50          3
                               78      74
16 pupils.                     +4


∴ Mean = 50 + 4/16 = 50¼
This method may be expressed as follows:


$M = A + \frac{\sum D}{N}$, where A is the guessed or arbitrary mean and
$\sum D$ is the sum of the differences (deviations) of each measure from
this mean.
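
The guessed-mean method can be checked with a short Python sketch. The function name is invented for the illustration, and the mark of pupil I is taken as 73, the value that agrees with the printed difference of 23 and the column total of 78.

```python
def mean_by_guess(marks, guess):
    """Mean via an arbitrary (guessed) mean: M = A + (sum of deviations) / N."""
    deviations = [m - guess for m in marks]
    return guess + sum(deviations) / len(marks)

# Marks of the sixteen pupils A to P in the example above (pupil I read as 73).
marks = [61, 40, 52, 37, 71, 47, 54, 32, 73, 45, 64, 38, 41, 50, 46, 53]

print(mean_by_guess(marks, 50))   # 50.25
print(sum(marks) / len(marks))    # 50.25, the same result by direct averaging
```

Any other guess gives the same mean; a guess close to the true value merely keeps the differences small and the arithmetic light.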


Grouping Numbers 

In the first few pages for the sake of simplicity we have avoided 
certain difficulties. These must now be faced. Numbers which 
are arranged in series are either continuous or discrete. For instance, 
a scale of temperature, which is measured by a column of mercury 
in a thermometer is clearly continuous, but an order of merit in 
which there is a number of places in a line clearly deals with dis-
crete numbers. The matter is of more than academic importance
because scales of scores, ages, heights and other measures usually 
imply continuity. Thus we have to ask, for example, what is the 
precise meaning of a mark of 17, or an age of 11 years. A score 
of 17 may mean a value between 16.5 and 17.5 with 17 as the 
mean value or mid-point, or again it may mean a value between 
17 and 18 (or 17.99 ...). Kelley, Holzinger and the majority of 
statisticians would take the first meaning of a score, but it must 
be remembered that the second meaning will give results .5 of a
mark interval higher. When scores are to be grouped as fre- 
quencies (i.e. the number of scores which fall in each mark range) 
it is first necessary to know the complete range or interval between 
the highest and lowest scores. This range is then broken up into 
smaller intervals of which there will be a number depending on 
the range of scores and their nature. If the number of intervals 
is too large little labour will be saved in the grouping of the 
measures and if the number is too small errors will arise because 
of the coarseness of the grouping. 


A grouping which is given as follows:

 0 -  5
 5 - 10
10 - 15
15 - 20 etc.


probably implies


 0 -  4.99...
 5 -  9.99...
10 - 14.99...
15 - 19.99... etc.


or if the integer is taken at the mid-point of a unit, the limits of
intervals would be taken as

−0.5 -  4.499...
 4.5 -  9.499...
 9.5 - 14.499...
14.5 - 19.499...
In order to avoid such complexities it will usually be satisfactory
to write the intervals like this:


 0 -  4
 5 -  9
10 - 14
15 - 19
20 - 24


In each case the mid-point of an interval is

$$\text{mid-point} = \text{lower limit of interval} + \frac{\text{upper limit} - \text{lower limit}}{2}$$


Finding a mean from grouped data 


It is usual to present data which has been obtained from large 
numbers of cases in grouped frequency form. The following 
example shows how the mean is calculated: 


Interval    Mid-point    Frequency (f)     x      fx


 0 -  4         2              2          -3      -6
 5 -  9         7              4          -2      -8
10 - 14        12              6          -1      -6
15 - 19        17             10           0       0     (negative sum = -20)
20 - 24        22              7           1       7
25 - 29        27              6           2      12
30 - 34        32              3           3       9
35 - 39        37              1           4       4     (positive sum = 32)
                            N = 39                       Total = 12


Thus, correction to arbitrary mean (17)

$$= \frac{\sum fx}{N} \times \text{interval width (5)} = \frac{12}{39} \times 5 = \frac{60}{39} = 1.54$$


True Mean = 17 + 1.54 = 18.54
This is the shortest and easiest method when there are many
measures. The method is this: 

The first column gives the intervals of each group, the second 
column the mid-point of each interval and the third column the 
frequencies or numbers of measures falling in each interval. The 
fourth column gives the deviation of each interval (in interval 
units) from the arbitrary mean which is 17 in this case. The last
column contains the products of the frequencies with their
respective interval deviations. The numbers are added algebraically
in the simple manner shown. The correction is made by dividing
the sum of the fx column by the total number of measures, but as
this comes out in interval units it must still be multiplied by the 
size of the interval (5) before it is applied to the arbitrary mean. 

The mean could have been obtained from the grouped fre- 
quencies by a longer method: 


(1) Add up the frequencies from each group to give the total 
number of measures 


(2) Multiply each frequency by the mid-point of its correspond- 
ing interval and add these products 


(3) Divide the sum of the products in (2) by the total number of 
measures in (1). There is considerable labour in this method if 
the number of measures is large. 
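
Both methods can be compared in a short Python sketch. The variable names are invented for the illustration; the mid-points and frequencies are those of the grouped table above, with the frequency of the 5-9 interval taken as 4, the value required by N = 39.

```python
# (mid-point, frequency) pairs from the grouped table in the text.
groups = [(2, 2), (7, 4), (12, 6), (17, 10), (22, 7), (27, 6), (32, 3), (37, 1)]
width = 5          # size of each interval
arbitrary = 17     # arbitrary mean, taken at a mid-point

n = sum(f for _, f in groups)

# Short method: deviations in interval units, then scale by the width.
sum_fx = sum(f * ((mid - arbitrary) // width) for mid, f in groups)
short_mean = arbitrary + (sum_fx / n) * width

# Longer method: sum of frequency x mid-point, divided by N.
long_mean = sum(f * mid for mid, f in groups) / n

print(n, sum_fx, round(short_mean, 2), round(long_mean, 2))   # 39 12 18.54 18.54
```

The two results agree, as they must; the short method simply keeps the intermediate numbers small.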


Median
The median is the mid-point in a distribution and the number 
of cases above it is equal to the number below it. It is easy to find 
the mid-point of a distribution which has an odd number of 
cases, e.g. 3. 4. 5. 5. 7. 8. 8. 9. 10 for clearly 7 is the value of the 
median which is the fifth case. 


If N is the number of cases and is odd, the median is the
$\frac{N+1}{2}$th case. In a distribution with an even number of cases, we
must take the mean value of the scores just above and just below
the centre point.*


* Median is not quite the same thing as ‘mid-score’ as the median is strictly 
a point and the mid-score will have a discrete value.
e.g. in 3. 4. 5. 5. 7. 8. 8. 9 the median falls between 5 and 7 and
can reasonably be given the value 6. 
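
The two cases can be sketched in Python as follows; the function is written out for illustration rather than taken from any library, and it is applied to the two distributions quoted in the text.

```python
def median(scores):
    """Middle value for odd N; mean of the two middle values for even N."""
    ordered = sorted(scores)
    n = len(ordered)
    middle = n // 2
    if n % 2 == 1:
        return ordered[middle]          # the (N + 1) / 2 th case
    return (ordered[middle - 1] + ordered[middle]) / 2

print(median([3, 4, 5, 5, 7, 8, 8, 9, 10]))  # 7
print(median([3, 4, 5, 5, 7, 8, 8, 9]))      # 6.0
```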

From this we can extend our division of the distribution into 
quartiles and percentiles. In the following distribution: 

2. 2. 4. 5. 6. 7. 8. 9. 10. 10. 12. 13. 14. 14. 16.

it is easy to see that 5. 9. 13 respectively are the values which lie
¼, ½, ¾ of the way along the distribution.

The measure representing the first quartile $Q_1$ is the $\frac{N+1}{4}$th.

The measure representing the second quartile (median) is the $\frac{N+1}{2}$th.

The measure representing the third quartile $Q_3$ is the $\frac{3(N+1)}{4}$th.

When the number of measures increased by one is not exactly 
divisible by 4 the same formulae hold: in the case of a large number 
of cases it will usually suffice to give the value at each quartile 
point as that of the nearest case. When we have a smaller number 
of measures an estimate of the values can be made by simple 
interpolation. 
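
A minimal Python sketch of these quartile positions follows, with simple linear interpolation where the position is not a whole number. The function names are invented for the illustration, and the sketch assumes at least three measures.

```python
def value_at_position(ordered, position):
    """Value at a 1-based, possibly fractional, position by linear interpolation."""
    lower = int(position)              # whole part of the position
    fraction = position - lower
    if fraction == 0:
        return ordered[lower - 1]
    return ordered[lower - 1] + fraction * (ordered[lower] - ordered[lower - 1])

def quartiles(scores):
    """Q1, median and Q3 at positions (N+1)/4, (N+1)/2 and 3(N+1)/4 (needs N >= 3)."""
    ordered = sorted(scores)
    n = len(ordered)
    return tuple(value_at_position(ordered, k * (n + 1) / 4) for k in (1, 2, 3))

scores = [2, 2, 4, 5, 6, 7, 8, 9, 10, 10, 12, 13, 14, 14, 16]
print(quartiles(scores))   # (5, 9, 13)
```

For the fifteen measures quoted in the text the positions are the 4th, 8th and 12th cases, so no interpolation is needed.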

We may extend the division of the distribution into deciles 
(10 divisions) or more usefully into percentiles (100 divisions). 


The xth decile is the score which is $\frac{x(N+1)}{10}$th from the beginning
or lower end of the distribution.

The xth percentile may be regarded as the measure which is
$\frac{x(N+1)}{100}$th from the beginning or lower end of the distribution.
This raises certain difficulties which increase if N is small, for
clearly the 100th percentile cannot be the (N + 1)th place. The
difficulty arises because percentile ranks correspond to points on a
scale which must be presumed to be continuous whereas individual
scores have a discrete value and a position. A percentile rank is
sometimes spoken of as its level, but a clear distinction must be
kept in mind between percentile ranks or levels and the score or
marks at these levels. When writers loosely use the word percentile
it is sometimes necessary to examine the context to see what is 
implied. 

In order to obtain an idea of percentile ranks let us consider
a class of 25 arranged in order, starting with the lowest.
Clearly each individual must correspond to 4 percentile divisions
and the first individual will be given a point in the middle of the
first four divisions. Thus, the lowest will have a percentile rank
of 2 and the top score will have a percentile rank of $\frac{100 + 96}{2}$ or 98.


If 100 people in order have to be assigned percentile ranks it is 
necessary to distribute the divisions along a scale from 0 to 100. 
Thus it would be wrong to assign the lowest score to 0 and the 
highest to 99 or the lowest to 1 and the highest to 100. The rank 
of the lowest individual is the mid-point of the interval 0-1 or 
.5 and the rank of the best score is the mid-point of 99-100 or 99.5. 

A formula for giving percentile ranks is $100 - \frac{100R - 50}{N}$, 
where R is order of merit and N the total number of places. 

[Usually percentile ranks are measured from the lowest score 
whereas an order of merit must start with the best score. Hence 
the right hand term is subtracted from 100. Sometimes for con- 
venience percentiles are given from the highest score but observa- 
tion should show when this has been done.] 
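
The mid-point rule can be checked numerically. The sketch below assumes the percentile-rank formula takes the form 100 - (100R - 50)/N, which reproduces the values worked out above for classes of 25 and 100; the function name is hypothetical.

```python
def percentile_rank(order_of_merit, number_of_places):
    """Percentile rank measured from the lowest score.

    order_of_merit: 1 for the best score, N for the lowest.
    Assumed form of the formula: 100 - (100R - 50)/N.
    """
    return 100 - (100 * order_of_merit - 50) / number_of_places


# Class of 25: lowest pupil (R = 25) and top pupil (R = 1).
print(percentile_rank(25, 25), percentile_rank(1, 25))
# Group of 100: lowest (R = 100) and best (R = 1).
print(percentile_rank(100, 100), percentile_rank(1, 100))
```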

In ordinary educational investigations concerned with the 
analysis and comparison of marks percentiles may be handled by 
plotting the ogival curves on graph paper.¹ When a large number 
of scores are to be dealt with it will be possible to join the plotted 
points by a smooth curve instead of in short straight lines. Not 
only are percentile curves useful for comparing distributions at 
various points, but they give a valuable means of fixing norms 
which are measures of typical performance of certain groups (age 
groups, etc.). The norm may be given at the median or mean of 
the group but the quartile scores are easily seen and these will 
probably be invaluable. Moreover, skew and dispersion of the 
distribution may easily be calculated by the method given at the 
end of the chapter. 

1 Instead of using ordinary graph paper and plotting the percentile or cumulative 
frequency curves in the form of ogives some workers use Otis's percentile charts or 
graph paper. Here the frequencies in a normal distribution produce a straight line 
instead of an ogive. This type of chart or graph paper is so ruled that the abscissae 
lines are in inverse proportion to the frequencies in a normal distribution. The 
slope of the line gives information concerning the numbers in the formula for the 
curve of the distribution. 

If we know the marks at the 1st, 10th, 25th, 50th, 75th, 90th 
and 99th percentiles, we have an excellent idea of the distribution 
and by plotting a graph we can find a score corresponding to a 
percentile, and a percentile (which gives us an idea of order of 
merit or rank in the distribution) corresponding to a given score. 

Fig. 8. Percentile scale for class of 70. Here the scores in the test are plotted 
horizontally and the percentile levels and their equivalent cumulative frequencies 
are plotted vertically. A point on the graph will give the score at any percentile 
level, or the total number of people who have not reached a certain mark. 

In a normal distribution a difference in percentile rank corresponds 
to a greater difference in scores at the beginnings and ends 
(the tails) of the distribution than in the middle. In fact as regards 
mark equivalents the 1st, 6th, 22nd, 50th, 78th, 94th and 99th are 
about equally spaced. We cannot therefore take the averages of 
a pupil's percentile ranks in various subjects in the same way that 
we can combine his scores. 
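
The equal spacing of these percentile levels can be verified with the normal distribution in Python's standard library. This is a sketch only; the variable names are illustrative and a standard normal curve (mean 0, S.D. 1) is assumed.

```python
from statistics import NormalDist

standard_normal = NormalDist()              # mean 0, standard deviation 1
levels = [1, 6, 22, 50, 78, 94, 99]

# Mark equivalents, in S.D. units, of each percentile level.
z_scores = [standard_normal.inv_cdf(level / 100) for level in levels]

# Distance between successive mark equivalents.
gaps = [later - earlier for earlier, later in zip(z_scores, z_scores[1:])]

for level, z in zip(levels, z_scores):
    print(f"{level:>3}th percentile: {z:+.2f} S.D.")
print([round(gap, 2) for gap in gaps])
```

Each gap comes out at a little under 0.8 of a standard deviation, although the steps in percentile rank range from 5 to 28.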


Finding percentiles when data are given in tabulated form 

The results of examinations and tests are often given in tabulated 
form and sometimes the statistical treatment of sets of marks is 
easier if they are put into group frequencies. 

Consider the following scores in an intelligence test. They are 
given as the frequencies (the number of persons tested) falling 
into each score range of 5 marks: 


Test Scores    Frequency    Cumulative Frequency 
135-139            0                 0 
130-134            5                 5 
125-129            8                13 
120-124            9                22 
115-119           12                34 
110-114           18                52 
105-109           25                77 
100-104           18                95 
 95- 99           20               115 
 90- 94           13               128 
 85- 89            6               134 
 80- 84            7               141 
 75- 79            2               143 


Total number N = 143. 

The majority of percentile levels will fall inside one of the 
classes or score ranges. In the above example with an awkward 
number such as N = 143 all of them will fall within a class. 

We can find the percentile (rank) corresponding to a given score 
from the following formula: 

$$x_p = L + \frac{C}{f}\left(\frac{PN}{100} - S\right)$$

where P = percentile, $x_p$ the value of the test score or other 
measure falling at this percentile level. 

L is the lower limit of the class in which $x_p$ lies. 

S is the sum of all the frequencies (the number of persons 
tested) up to but not including this class. 

f the frequency within this class. 

N the total of all the frequencies. 

C the size of this class. 


Percentiles also offer a useful way of comparing sets of marks no 
matter what are the scales of marking. 

It is obvious that there is some advantage in giving a student's 
score in terms of a percentile for then the middle of the rank 
would always be the 50th percentile. The unfamiliarity of this 
method to the layman or the uninitiated would probably lead to 
errors in its interpretation. Although percentiles give a ready 
means of comparing distributions they must not be used for 
combining them. Obviously percentile units are much closer to 
one another near the middle of a distribution than they are at 
each end. 

Example: Find the 77th percentile score if 

P = 77   N = 143   L = 115   f = 12   C = 5   S = 109 

$$x_{77} = 115 + \frac{5}{12}\left(\frac{143 \times 77}{100} - 109\right) = 115 + \frac{5}{12}(110.1 - 109) = 115 + \frac{5}{12}(1.1) = 115.46$$
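
The same calculation can be sketched in Python for any percentile level. The function name and the list layout are hypothetical; the class limits and frequencies are those of the intelligence-test table, listed from the lowest class upwards so that S can be accumulated from the lower end.

```python
def score_at_percentile(p, classes, class_size):
    """Score at the p-th percentile of a grouped distribution.

    classes: (lower limit L, frequency f) pairs, lowest class first.
    Applies x = L + (C / f) * (P * N / 100 - S).
    """
    n = sum(frequency for _, frequency in classes)
    target = p * n / 100                 # P * N / 100
    below = 0                            # S: frequencies below the class
    for lower_limit, frequency in classes:
        if frequency > 0 and below + frequency >= target:
            return lower_limit + class_size / frequency * (target - below)
        below += frequency
    raise ValueError("percentile must lie between 0 and 100")


test_scores = [(75, 2), (80, 7), (85, 6), (90, 13), (95, 20), (100, 18),
               (105, 25), (110, 18), (115, 12), (120, 9), (125, 8),
               (130, 5), (135, 0)]

print(round(score_at_percentile(77, test_scores, class_size=5), 2))
```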


Measures of Dispersion, Variability or Deviation 

We may summarize the uses of the various measures of central 
tendency as follows: 

1. Mean. This is used when each score or measure should have 
equal weight, when the most reliable measure of central tendency 
is required and when standard-deviations and product-moment 
correlation coefficients are required. 




2. Median. This is useful when a quick and easily calculated 
measure of central tendency is required, when there are extreme 
measures which would weight the mean in a disproportionate 
manner, and when certain scores which are known in frequency 
but not as individual numerical values are included in parts of 
the distribution. 


3. Mode. This gives the most often recurring score and yields a 
quick approximate measure of score concentration. 


The mean, median and mode are various ways of regarding the 
central tendency in a distribution but it is also necessary to have 
a measure of the spread or dispersion of the set of marks or other 
measures. In order to secure a proper arrangement of a number 
of pupils in order of merit it is obviously necessary that the marks 
should not be bunched together at any point but should be 
properly distributed. Again, when we come to consider the 
problems of error in estimating psychological 'factors' it is 
necessary to know how the errors are distributed. These are 
two of the many instances of the use of methods of estimating 
dispersion in mental measurement. 

Interquartile Range 

The quartile deviation is widely used. If the scores are arranged 
in rank or order of merit, the difference in score between the first 
and third quartile points is known as the interquartile range. We 
arrange the scores in order of merit, find the score which is a 
quarter of the way along the distribution and that which is 
three-quarters of the way along the distribution and subtract the 
scores. Dividing by two gives the quartile deviation Q (or the 
semi-interquartile range): 

$$Q = \frac{Q_3 - Q_1}{2}$$

It will be observed that $Q_2$ is the score of the mid-point in the 
order of merit list. It is therefore the median score. 

Half the total number of scores lie between the first and third 
quartile points and thus the difference of the score values at these 
points (or more conveniently half this value) is a measure of the 
spread or dispersion of the scores. [Later it will be seen that a 
similar method will be applied to derived measures such as 
deviations and errors and the term probable error, which will be 
explained later on, is often used. This corresponds to interquartile 
range which is usually applied to scores or primary measures.]¹ 
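
As a rough illustration, the quartile deviation can be computed with Python's standard library. The marks are hypothetical. The "exclusive" method of `statistics.quantiles` places the quartiles at the (N+1)/4, (N+1)/2 and 3(N+1)/4 positions and interpolates where needed.

```python
from statistics import quantiles

marks = [2, 4, 5, 7, 8, 9, 11, 12, 13, 15, 16]      # hypothetical marks

# Quartile points Q1, Q2 (median) and Q3.
q1, median, q3 = quantiles(marks, n=4, method="exclusive")

interquartile_range = q3 - q1
quartile_deviation = interquartile_range / 2         # Q, the semi-interquartile range

print(q1, median, q3, interquartile_range, quartile_deviation)
```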


Mean Deviation or Average Deviation from the Mean. (Mean Variation) 


The deviations (differences) of the scores from the mean or 
average are all regarded as positive and added together. This 
sum is divided by the number of individuals or cases. 

$$\text{Mean deviation M.D.} = \frac{\sum |d|}{N}$$

The Mean Deviation is not often required in educational 
statistics. When a distribution is symmetrical it marks off about 
57.5% of the measures above and below the mean. 
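
A short Python sketch of the mean deviation follows. The marks are hypothetical, and the check of the 57.5 per cent figure assumes a normal distribution, for which the mean deviation is the standard deviation multiplied by the square root of 2/π.

```python
from math import pi, sqrt
from statistics import NormalDist, fmean


def mean_deviation(scores):
    """Average of the deviations from the mean, all taken as positive."""
    mean = fmean(scores)
    return sum(abs(score - mean) for score in scores) / len(scores)


marks = [8, 7, 4, 9, 2]                      # hypothetical marks, mean 6
print(mean_deviation(marks))

# Share of a normal distribution lying within one M.D. of the mean.
curve = NormalDist()                         # mean 0, S.D. 1
md = sqrt(2 / pi)                            # M.D. of the standard normal curve
share_within_md = curve.cdf(md) - curve.cdf(-md)
print(round(100 * share_within_md, 1))
```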


Figure. Two distributions with the same number of cases and mean 
but with different standard deviations. 


1 The range from the 10th to the 90th percentiles, called D by some writers, is a 
useful measure of dispersion. 


Standard Deviation 

This measure of dispersion or spread is of great importance and 
is that which is usually of the most value for mathematical treatment 
and for the calculation of correlation coefficients. In finding 
the Mean Deviation above, we regarded each of the deviations 
as having a positive sign, which was not actually true. If each of 
the deviations is squared this difficulty is overcome. Moreover, 
the squaring of each deviation will tend to give due weight to 
any comparatively large deviation. It also remains to be said that 
the use of Standard Deviation is in keeping with the mathematical 
properties of the curve of normal distribution and the symbol for 
S.D. appears in the formula of the curve.¹ 

To find standard deviation each deviation is calculated and 
squared. The column of squares is summed and this sum is 
divided by the number of cases and finally the square root is 
taken. S.D. is 'root-mean-square' and is usually represented by 
the small Greek letter sigma, $\sigma$: 

$$\sigma = \sqrt{\frac{\sum D^2}{N}}$$

where D is the deviation of a score from the mean. 

Sometimes when we are comparing sets of scores it is necessary 
to add a subscript to sigma, thus $\sigma_1$ or $\sigma_2$, to indicate to which 
group of marks the standard deviation refers. Readers who are 
not familiar with mathematical notation need not be worried 
about the sign $\sum$, which is the large Greek sigma (S) and means 
'the sum of'. 

Students should consider the following four methods of com- 
puting the standard deviation, and choose that which appears to 
be the easiest and most labour-saving in view of the given data. 

1. The direct method. The mean (or average) is found, the 
deviation of each score or mark from the mean is calculated, these 
are squared, added, divided by N and the square root is found. 

In all these methods of calculating the standard deviation a set 
of tables of squares and square roots such as Barlow's, logarithms 
and/or a simple slide-rule will be useful. It is hardly ever necessary 
to give the answer correct to more than two places of decimals and 
usually one will suffice.² 

1 In mathematical language Standard Deviation $\sigma$ is the parameter of the 
equation of the normal distribution curve. 

2 A word of warning ought to be given concerning the finding of square roots. 
A rough mental estimate will always give the clue to the particular square which is 
required and where the decimal point should be placed. 
To square a number by logarithms, double the log of the number and find the 
antilog. To find the square root halve the logarithm of the number and then find 
the antilog. See the appendix for the use of the slide-rule for this and other 
purposes. 

Example (for the sake of simplicity a very 'short' list of scores is 
taken): 

Mark        D              D² 
  8      8 - 6 =  2         4 
  7      7 - 6 =  1         1 
  4      4 - 6 = -2         4 
  9      9 - 6 =  3         9 
  2      2 - 6 = -4        16 
Total 30                   34 

Mean = 30/5 = 6 
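
The direct method can be sketched in Python with the five marks of this example. The variable names are illustrative; `statistics.pstdev` is used only as a cross-check, since it also divides by N.

```python
from math import sqrt
from statistics import pstdev

marks = [8, 7, 4, 9, 2]

n = len(marks)
mean = sum(marks) / n                          # 30 / 5 = 6
deviations = [mark - mean for mark in marks]   # the D column
squares = [d * d for d in deviations]          # the D squared column

sd = sqrt(sum(squares) / n)                    # root of the mean square
print(mean, sum(squares), round(sd, 2))
print(round(pstdev(marks), 2))                 # same value from the library
```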


2. Usually the mean does not turn out to be a whole number 
and the squares of the deviations contain decimal fractions which 
cause considerable labour. In this case we guess a mean which is 
a whole number and then apply a correction. A quick mental 
calculation will suffice to supply the arbitrary mean.¹ 


Mark        D               D² 
 10     10 - 6 =  4         16 
  3      3 - 6 = -3          9 
  7      7 - 6 =  1          1 
  8      8 - 6 =  2          4 
  5      5 - 6 = -1          1 
  4      4 - 6 = -2          4 
                      ΣD² = 35 

N = 6,  ΣD²/N = 35/6 = 5.83 
Guessed mean A = 6 
True mean M = 37/6 = 6.17 

The formula for S.D. in this case is 

$$\sigma = \sqrt{\frac{\sum D^2}{N} - (M - A)^2} = \sqrt{5.83 - (6.17 - 6)^2} = \sqrt{5.83 - .03} = \sqrt{5.8} = 2.41$$

1 This formula may be evolved as follows. 

Suppose the difference of the true and arbitrary means equals c: M - A = c. 

Thus if $x_1$ is a deviation from the assumed mean and x is a deviation from the 
true mean 

$$x_1 = x + c$$

Squaring: $x_1^2 = x^2 + 2xc + c^2$ 

Summing for the whole set of scores: 

$$\sum x_1^2 = \sum x^2 + 2c\sum x + Nc^2$$

(as the $c^2$ will be the same for each score). 

Now $\sum x = 0$ because it is the sum of the deviations about the actual mean. 

Thus $\sum x_1^2 = \sum x^2 + Nc^2$, i.e. $\sum x^2 = \sum x_1^2 - Nc^2$. 

Dividing through by N and substituting for $\sigma^2$: 

$$\sigma^2 = \frac{\sum x^2}{N} = \frac{\sum x_1^2}{N} - c^2$$

Note. The deviations have been expressed here in terms of x and $x_1$ to avoid 
confusion arising from the different uses of D as a deviation. 


3. When there are only a few numbers to be considered and all 
the scores or marks are whole numbers, it will suffice to call the 
arbitrary 'mean' zero. Thus, the deviations (D) will be the 
original marks (x) and the formula then becomes 

$$\sigma = \sqrt{\frac{\sum x^2}{N} - M^2}$$

Mark x       x² 
  10        100 
   3          9 
   7         49 
   8         64 
   5         25 
   4         16 
Σx = 37   Σx² = 263 

M = 37/6 = 6.17 

$$\sigma = \sqrt{\frac{263}{6} - 6.17^2} = \sqrt{43.83 - 38.03} = \sqrt{5.8} = 2.41$$


4. The mean can be calculated at the same time as the standard 
deviation by using a modification of the formula given under method 2, 
which now becomes 

$$\sigma = \sqrt{\frac{\sum D^2}{N} - \left(\frac{\sum D}{N}\right)^2}$$

which is obvious when we remember that 

True Mean = $\frac{\sum D}{N}$ + A (Arbitrary Mean) 

and D is the deviation from the guessed or arbitrary mean. 
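
The short-cut methods can be compared in a brief Python sketch using the six marks of the worked examples. The variable names are illustrative only; the guessed mean of 6 is the one used in the text.

```python
from math import sqrt

marks = [10, 3, 7, 8, 5, 4]
n = len(marks)

# Methods 2 and 4: deviations D from a guessed (arbitrary) mean A.
guessed_mean = 6
d = [mark - guessed_mean for mark in marks]
sum_d = sum(d)
sum_d2 = sum(value * value for value in d)

true_mean = guessed_mean + sum_d / n               # M = A + (sum of D) / N
sd_guessed = sqrt(sum_d2 / n - (sum_d / n) ** 2)   # correction is (M - A) squared

# Method 3: arbitrary mean of zero, so that D is the raw mark x.
sum_x2 = sum(mark * mark for mark in marks)
sd_raw = sqrt(sum_x2 / n - true_mean ** 2)

print(sum_d2, sum_x2, round(true_mean, 2))
print(round(sd_guessed, 2), round(sd_raw, 2))
```

Both routes give the same standard deviation because the correction term removes exactly the amount by which the guessed mean misses the true mean.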


Calculation of the Standard Deviation when the measures are given in 
grouped frequencies 

Even with the use of tables, slide-rules and calculating machines 
there is considerable labour in calculating the S.D. of a large 
number of measures. This may often be simplified by putting 
them into frequency groups. Or it may happen that the measures 
are originally given in this form. 
The formula then becomes: 

$$\sigma = \sqrt{\frac{\sum fD^2}{N} - \left(\frac{\sum fD}{N}\right)^2}$$

in terms of the size of the interval (or extent of each group). 
If we wish to express the formula in the same units as the 
measure (i.e. in score form) the formula is 

$$\sigma = i\sqrt{\frac{\sum fD^2}{N} - \left(\frac{\sum fD}{N}\right)^2}$$

where i is the size of the interval of the group. 

When a calculating machine is used the easiest form of this 
expression is 

$$\sigma = \frac{i}{N}\sqrt{N\sum fD^2 - \left(\sum fD\right)^2}$$

In each case all the scores in the interval are taken to have a 
value equal to that given by the mid-point of the interval. D is 
the deviation of each measure from an arbitrary mean and f the 
frequency, i.e. the number of measures in each class or interval. 

Example: In the following table the marks are given in the first 
column, the mid-points of the intervals in the next and then the 
frequency in each interval. Find the S.D. 

            Mid-point 
Marks      of Interval     f      D      fD     fD² 
91-100        95.5         1     +4       4     16 
81- 90        85.5         2     +3       6     18 
71- 80        75.5         3     +2       6     12 
61- 70        65.5         6     +1       6      6 
51- 60        55.5        11      0       0      0 
41- 50        45.5        12     -1     -12     12 
31- 40        35.5        10     -2     -20     40 
21- 30        25.5         6     -3     -18     54 
11- 20        15.5         3     -4     -12     48 
 1- 10         5.5         1     -5      -5     25 

N = 55     ΣfD = -45     ΣfD² = 231 

$$\text{S.D.} = 10\sqrt{\frac{231}{55} - \left(\frac{45}{55}\right)^2} = 10\sqrt{4.20 - .67} = 10\sqrt{3.53} = 10 \times 1.88 = 18.8$$
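
The grouped calculation can be reproduced with a short Python sketch. The frequencies and step deviations are those of the table; the variable names are illustrative, and the arbitrary mean is taken at 55.5, the mid-point of the class with D = 0.

```python
from math import sqrt

interval = 10                                   # i, the size of each class
arbitrary_mean = 55.5                           # mid-point of the class with D = 0
frequencies = [1, 2, 3, 6, 11, 12, 10, 6, 3, 1]
steps = [4, 3, 2, 1, 0, -1, -2, -3, -4, -5]     # D, in interval units

n = sum(frequencies)
sum_fd = sum(f * d for f, d in zip(frequencies, steps))
sum_fd2 = sum(f * d * d for f, d in zip(frequencies, steps))

# S.D. in score units.
sd = interval * sqrt(sum_fd2 / n - (sum_fd / n) ** 2)

# Form suited to a calculating machine; it gives the same value.
sd_machine = interval / n * sqrt(n * sum_fd2 - sum_fd ** 2)

# The mean falls out of the same columns.
mean = arbitrary_mean + interval * sum_fd / n

print(n, sum_fd, sum_fd2)
print(round(sd, 1), round(sd_machine, 1), round(mean, 1))
```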


Sheppard's Correction for Grouped Data 

When the measures are grouped into a frequency distribution the S.D. calculated 
by the method above is slightly larger than it would have been had the measures been 
dealt with separately. It can easily be seen that when the deviations are squared, 
those that lie beyond the mid-point will add relatively more to the sum than those 
that lie on the 'smaller' side and the matter is further complicated by the fact that 
each interval in the diagram has a trapezoidal shape. In the case of a normal 
distribution Sheppard has shown that in terms of interval units the $\sigma^2$ should be 
diminished by $\frac{1}{12}$. Thus the corrected S.D. will be given by $\sqrt{\sigma^2 - \frac{1}{12}} \times i$ where 
$\sigma$ is the crude S.D. found from the grouped frequencies. This is equivalent to 

$$\text{corrected S.D.} = \sqrt{\frac{\sum fD^2}{N} - \left(\frac{\sum fD}{N}\right)^2 - \frac{1}{12}} \times i = \frac{i}{N}\sqrt{N\sum fD^2 - \left(\sum fD\right)^2 - \frac{N^2}{12}}$$
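
Applied to the grouped example of marks in intervals of 10, the correction can be sketched as follows. The figures are those of the worked table; the variable names are illustrative.

```python
from math import sqrt

interval = 10
frequencies = [1, 2, 3, 6, 11, 12, 10, 6, 3, 1]
steps = [4, 3, 2, 1, 0, -1, -2, -3, -4, -5]     # D, in interval units

n = sum(frequencies)
sum_fd = sum(f * d for f, d in zip(frequencies, steps))
sum_fd2 = sum(f * d * d for f, d in zip(frequencies, steps))

# Crude variance in interval units, before correction.
crude_variance = sum_fd2 / n - (sum_fd / n) ** 2

crude_sd = interval * sqrt(crude_variance)
corrected_sd = interval * sqrt(crude_variance - 1 / 12)   # Sheppard's correction

print(round(crude_sd, 2), round(corrected_sd, 2))
```

The correction lowers the S.D. only slightly, from about 18.8 to about 18.6 in this case.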


As we shall see later when we are studying normal distribution 
the standard deviation is a most important measure of dispersion. 
For instance, if we assume normal distribution and know the 
value of the mean (which in this case will also be equal to median 
and mode values) we can calculate in terms of the standard 
deviation the y value (number of cases) for any x value (score or 
marks). If we assume, for instance, that intelligence quotient (I.Q.) 
is distributed normally and we know the standard deviation and 
can assume a mean of 100, we can at once calculate the percentage 
of population possessing particular intelligence quotients, or with 
I.Q.s between one level and another. This will be understood by 
a consideration of the properties of the curve dealt with in 
Chapter V. 

The uses of the various Measures of Dispersion (Spread, Varia- 
bility) may be summarized as follows: 

1. Q. Semi-Interquartile range. This is used to give a quick measure 
of variability by inspection, when there are scattered or extreme 
measures and when the degree of concentration round the median 
is required. 

2. M.D. Mean Deviation (Mean Variation, Average Deviation). This 
is of occasional value when extreme deviations should not be 
allowed to influence the measure of dispersion unduly and when 
it is desired to weight all deviations according to their size. 


3. S.D. Standard Deviation, $\sigma$. This is the most reliable measure 
of dispersion; it leads to various other mathematical methods and is 
necessary when coefficients of correlation, measurements of reliability 
and variance are to be calculated. Extreme deviations exert a 
proportionately greater influence on this measure of dispersion. 

Assuming a normal distribution the following numerical 
relationships occur: 

Q = .6745σ 
M.D. = .7979σ 
D = 2.5631σ 
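
These three constants can be recovered from the normal curve itself. The sketch below uses Python's standard library with a unit standard deviation; D is taken, as in the earlier footnote, to be the range from the 10th to the 90th percentile.

```python
from math import pi, sqrt
from statistics import NormalDist

curve = NormalDist()                  # mean 0, standard deviation 1

# Q: half the distance between the first and third quartile points.
q = (curve.inv_cdf(0.75) - curve.inv_cdf(0.25)) / 2

# M.D.: mean of the absolute deviations of a normal variable.
md = sqrt(2 / pi)

# D: range from the 10th to the 90th percentile.
d = curve.inv_cdf(0.90) - curve.inv_cdf(0.10)

print(round(q, 4), round(md, 4), round(d, 4))
```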


Standardized and Normalized Scores 


If the scores in a test are represented as measures below or 
above their average, and they are then divided by their standard 
deviation, they are represented by $z_1$, $z_2$, etc. and are said to be 
standard (or z) scores. Approximately two-thirds of the scores will 
lie between 1 and -1. If the scores can be taken to be distributed 
normally each set of scores can be regarded as equivalent and 
comparable. Standard scores can be regarded as deviations from 
the mean which have been adjusted so that the standard deviation 
is unity. (It is possible that to call the average mark 0 and to 
make all marks below it negative, may have a bad psychological 
value, but in the statistical handling of scores it is often the most 
convenient way.) Sometimes the scores are normalized by dividing 
their differences from the mean by $\sigma\sqrt{N}$, that is, by the product 
of the standard deviation and the square root of the number of 
persons. Standardized scores can be converted to normalized 
scores by dividing by the root of the number of persons. In the 
case of normalized scores it will be seen that the sum of the squares 
of the scores is unity, and as we shall see later the sum of the 
products of two sets of normalized scores is the correlation coefficient. 

The variance of a set of scores is the square of the standard 
deviation. Where a set of scores has been standardized the 
variance will clearly be unity. We shall use this again when we 
meet factorial and variance analysis. 
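
A small Python sketch may make these properties concrete. The two sets of marks and the function names are hypothetical; the second set is deliberately an exact linear function of the first so that the sum of products comes out at 1.

```python
from math import sqrt
from statistics import fmean, pstdev, pvariance


def standard_scores(scores):
    """Deviations from the mean divided by the standard deviation."""
    mean, sd = fmean(scores), pstdev(scores)
    return [(score - mean) / sd for score in scores]


def normalized_scores(scores):
    """Standard scores divided by the square root of the number of persons."""
    root_n = sqrt(len(scores))
    return [z / root_n for z in standard_scores(scores)]


science = [10, 3, 7, 8, 5, 4]                  # hypothetical marks
maths = [2 * mark + 1 for mark in science]     # perfectly related set

z = standard_scores(science)
u_science = normalized_scores(science)
u_maths = normalized_scores(maths)

print(round(pvariance(z), 6))                               # variance of z scores
print(round(sum(u * u for u in u_science), 6))              # sum of squares
print(round(sum(a * b for a, b in zip(u_science, u_maths)), 6))  # sum of products
```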

It may be useful to return to the question of percentiles and to 
think of them in terms of standard scores. 


Assuming a normal distribution: 

Percentile     Standard Score Mark       Deviation from mean 
Level          (mean 50, S.D. 10)        (in S.D. units) 
   99                 73                      + 2.3 
   90                 63                      + 1.3 
   75                 57                      +  .7 
   50                 50                         0 
   25                 43                      -  .7 
   10                 37                      - 1.3 
    1                 27                      - 2.3 

The limits of the distribution are taken to be + 3 S.D. to 
- 3 S.D. 
(In the area under the normal curve (see Chapter V) only 
.135% of the measures lie beyond each of these limits, or .27% in all.) 

For psychological reasons the mean might be taken as 60 
instead of 50, all the marks then being raised by 10. This does 
not affect the distribution. 

Intelligence tests differ with respect to both their mean and 
their standard deviation. Scores can only be compared by 
standardization. In the Moray House Tests the mean is taken as 
100 and the S.D. 15. 


Percentile        Score      Standard Score¹ 
99 (approx.)       135          + 2.30 
95                 125          + 1.70 
90                 120          + 1.30 
84                 115          + 1.00 
75                 110          +  .70 
50                 100             0 
25                  90          -  .70 
16                  85          - 1.00 
10                  80          - 1.30 
 5                  75          - 1.70 
 1 (approx.)        65          - 2.30 


It will be observed that the scores one standard deviation from 
the mean fall at the 16th and 84th percentile levels. 

Sometimes it is necessary to convert these sigma or z scores to 
a scale with a given mean and a given standard deviation. Such 
an operation would also obviate the necessity of using negative 
scores and those with decimal fractions. Such scores were called 
T scores by McCall in How to Measure in Education. All that is 
necessary is to multiply each z score by the given S.D. and add 
to or subtract from the given mean. 
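
The conversion is a single line of arithmetic, sketched here in Python. The function name is hypothetical; the two scales shown are the mean 50, S.D. 10 scale and the Moray House scale (mean 100, S.D. 15) of the tables above.

```python
def rescale(z_score, given_mean, given_sd):
    """Convert a z score to a scale with a given mean and S.D."""
    return given_mean + z_score * given_sd


# Scale with mean 50 and S.D. 10.
for z in (2.3, 0.0, -2.3):
    print(z, rescale(z, 50, 10))

# Moray House scale with mean 100 and S.D. 15.
for z in (1.0, -1.0):
    print(z, rescale(z, 100, 15))
```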


1 Some writers do not differentiate between standard and standardized scores, 
but this need not cause the reader any confusion. A standard score really means a 
score given as a deviation from the mean with the standard deviation as unit, i.e. 
deviation divided by standard deviation. Standardized scores mean those that 
have been adjusted to an agreed mean and standard deviation. Before such 
adjustment the scores are called raw scores. 

Measure of Skewness 

If a distribution is symmetrical, its median, mode and mean 
are at the same point. If a distribution has a positive skew, that 
is, if it has a long tail stretching towards the high scores, its 
median will be less than its mean and its mode will usually be 
lower still, so that the median lies between the mode and the mean. 

$$\text{Skewness } Sk = \frac{\text{mean} - \text{mode}}{\text{standard deviation}} = \frac{M - Mo}{\sigma}$$

$$\text{or } Sk = \frac{3(M - Md)}{\sigma}$$

where Md is the median. 


[A less useful measure of skewness is given by 

$$Sk = \beta_1 = \frac{\left(\sum x^3\right)^2}{N^2\sigma^6}$$

where the x's are deviations from the mean, and N is the number 
of measures in the distribution.] 

The shape of a symmetrical distribution is measured by its 
kurtosis¹ or flatness $\beta_2$: 

$$\beta_2 = \frac{\sum x^4}{N\sigma^4}$$

For normal distribution $\beta_2 = 3$. 
Mode = M - C(M - Md) 

For many curves and for moderate degrees of skewness C = 3. 
Thus, to compute the mode from the mean and the median 

Mode = M - 3(M - Md) 
     = 3Md - 2M 

(which could have been obtained by equating the first two 
expressions given above for Sk.) 
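
The two expressions for Sk and the estimate of the mode can be sketched as follows. The function names and the figures (mean 52, median 50, S.D. 10) are hypothetical.

```python
def skewness_from_mode(mean, mode, sd):
    """Sk = (mean - mode) / standard deviation."""
    return (mean - mode) / sd


def skewness_from_median(mean, median, sd):
    """Sk = 3 (mean - median) / standard deviation."""
    return 3 * (mean - median) / sd


def estimated_mode(mean, median):
    """Mode = 3 Md - 2 M, for moderate degrees of skewness."""
    return 3 * median - 2 * mean


mean, median, sd = 52, 50, 10                  # hypothetical positive skew
mode = estimated_mode(mean, median)

print(mode)
print(skewness_from_median(mean, median, sd))
print(skewness_from_mode(mean, mode, sd))
```

The mode estimated in this way lies below the median, which in turn lies below the mean, as expected for a positive skew.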


1 A sharply pointed curve is said to be leptokurtic, a flat curve platykurtic and a 
moderate curvature mesokurtic. 

Skewness may be measured by using the scores at the 10th, 50th 
and 90th percentiles: 

$$Sk = \frac{P_{90} + P_{10}}{2} - P_{50}$$

Similarly kurtosis may be measured by $\frac{Q}{P_{90} - P_{10}}$. 
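
A sketch of these percentile measures follows. It assumes the formulae read Sk = (P90 + P10)/2 - P50 and kurtosis = Q/(P90 - P10), with Q the semi-interquartile range; the function names are hypothetical, and the percentile scores are taken from a normal curve with mean 50 and S.D. 10 purely for illustration.

```python
from statistics import NormalDist


def percentile_skewness(p10, p50, p90):
    """Assumed form: Sk = (P90 + P10) / 2 - P50."""
    return (p90 + p10) / 2 - p50


def percentile_kurtosis(p10, p25, p75, p90):
    """Assumed form: Q / (P90 - P10), Q being the semi-interquartile range."""
    q = (p75 - p25) / 2
    return q / (p90 - p10)


curve = NormalDist(mu=50, sigma=10)            # illustrative normal distribution
p10, p25, p50, p75, p90 = (curve.inv_cdf(level / 100)
                           for level in (10, 25, 50, 75, 90))

skew = percentile_skewness(p10, p50, p90)
kurt = percentile_kurtosis(p10, p25, p75, p90)
print(round(skew, 4), round(kurt, 4))
```

For a normal curve the skewness is zero and the kurtosis ratio is about .263, which is simply .6745 divided by 2.5631.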


Coefficient of Variability 

The relation which the standard deviation bears to the mean 
score is of interest as it gives a measure of the variability, which is 
independent of the units used. 


Thus the variability is $\frac{\sigma}{M}$. 
If this is expressed as a percentage it is called the coefficient of 
variability: 

V* = $\frac{100\sigma}{M}$ 

This is quite independent of the measures used, whether they 
are marks or the weights of human beings. In general, if V is 
greater than 1/4 or 25% the dispersion is regarded as being rather 
large and the results should be used with great caution. The 
coefficient of variability (of variation or of relative variability) is 
not reliable if the true zero points of the sets of measures are not 
known, that is, if all the measures in a set are 'padded out'. The 
formula for kurtosis given above is a much more reliable measure 
of the shape of a distribution than the coefficient of variability, 
the general use of which is not recommended. 


* V is also used for Variance, and its two uses should not be confused. 
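
The behaviour of the coefficient of variability can be illustrated with a short Python sketch. The marks and the function name are hypothetical; the conversion factor of 2.2 merely stands for a change of units, and the added 50 for marks that have been 'padded out'.

```python
from statistics import fmean, pstdev


def coefficient_of_variability(measures):
    """V = 100 * standard deviation / mean, expressed as a percentage."""
    return 100 * pstdev(measures) / fmean(measures)


marks = [10, 3, 7, 8, 5, 4]                        # hypothetical marks
in_other_units = [2.2 * mark for mark in marks]    # same measures, new unit
padded = [mark + 50 for mark in marks]             # zero point shifted

v = coefficient_of_variability(marks)
v_units = coefficient_of_variability(in_other_units)
v_padded = coefficient_of_variability(padded)

print(round(v, 1), round(v_units, 1), round(v_padded, 1))
```

A change of unit leaves V untouched, whereas shifting the zero point alters it greatly, which is why the coefficient is unreliable when the true zero is unknown.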


CHAPTER III 


CORRELATION AND REGRESSION 


If we consider the marks in science and mathematics gained by 
the members of a class, we should feel justified in expecting that 
there may be some relation between them. We should hardly 
anticipate that the top boy in science would also be the top boy in 
mathematics and that all the boys would have the same orders in 
both subjects until we came to the unfortunate boy who was at 
the bottom of the list in science and also in mathematics. If this 
curious relationship between the mark lists in these subjects did 
exist, with its exact correspondence of one order to the other, we 
should say that the marks were perfectly correlated positively. If the 
orders of the marks in both subjects were reversed, the top boy in 
one subject was the bottom boy in the other, the second boy in 
the science list was the last but one in the mathematics list, and 
so on (this is unthinkable, of course!), we should say that here 
was a case of perfect negative correlation. If the marks in science bore 
no relation at all to those in mathematics we should say that there 
was no correlation. In practice we should expect to find some 
positive connection between marks in these two subjects, but it 
would be partial or imperfect correlation. This type of correlation 
is most important when we consider examination marks, and the 
scores in psychological and other tests; and exact mathematical 
methods for dealing with it are of the utmost importance in many 
educational and psychological researches. The correlation coefficient 
is almost as important to the psychological tester as is the 
balance to the chemist. As we shall see in a later chapter, many 
extraordinary assertions were made by educationists and psychologists 
in the past, and continue even today, because statements 
concerning human abilities or 'intelligence' had not been subjected 
to rigorous analysis in which the use of correlation coefficients is 
invaluable. Nevertheless, other techniques are sometimes more 
valuable, but a clear idea of correlation is none the less of prime 
importance. 

We can obtain a useful graphical idea of the degree of correlation 
between sets of numbers by plotting a scatter diagram or 
scattergram. Suppose we plot the scores in two subjects or tests of 
a number of individuals, giving a point on a two-dimensional 
graph to each individual. The co-ordinates of each point (x, y) 
are measures of the scores in each subject. Suppose further that 
the scores have been standardized by calling the mean (average) 
of each set zero, and then dividing each deviation from zero by 
the standard deviation of the set. 

If there were no correlation between the x and y values (the 
scores in each test) the points representing the individuals would 
be distributed in a haphazard manner over the graph paper, 
that is to say, there would be a fairly even density of points on the 
graph paper, provided that we had taken results from a sufficiently 
large number of individuals.¹ If there existed some degree of 
correlation between the x and y scores, we should find that the 
points … and were more dense in a certain … are said to have made 
a scatter diagram or scattergram. 

Fig. 9. 

1 At first, students may obtain a better idea of the method of finding the line of 
best fit amongst the points by imagining that the x axis is removed and by considering 
the points on the right side of the y axis only. 

Where correlation is present we can find a line which best fits 
the distribution of the points which we have plotted. Few, if any, 
points will lie on it, but the line will go through the cluster of 
points so that there is a 'balance' of points on each side of it. It 
would then be the line of best fit, and would be known as the line 
of regression.¹ (Although this term is not particularly apt for 
psychological work, it is invariably used. It is a biometry term 
used by Galton to show that the average heights of offspring tend 
to 'regress back towards the mean of the race'.) 

Suppose that the correlation is a perfect positive one. The 
points would be bunched together in the first and third quadrants 
and the line of best fit would make an angle of 45° with the 
positive x axis. If, on the other hand, there was perfect negative 
correlation, the points would be bunched together in the second 
and fourth quadrants, and the line of regression would be at 
right angles to that representing perfect positive correlation. In 
education and psychology we usually find that correlation, if 
present, is partial positive correlation. Thus we shall find the 
line of regression in the first and third quadrants (or if we are 
… or untreated scores upwards from 0 in the … 

The slope of the regression line, that is its m value, or the 
tangent of the angle which it makes with the x axis, is equal to r, 
the correlation coefficient.¹ For perfect positive correlation 
r = 1 (= tan 45°). 

1 The sum of the squares of the distances (see Appendix V) of the points from the 
line should be a minimum. 

… the two variables. As the correlation coefficient increases, our 
accuracy of estimation of the one variable will improve as shown 
in the table and diagram on page 52. It will repay the student to 
consider the significance of the correlation coefficient, as it appears 
in regression equations, and its use in the prediction of the amount 
of regression will give a clearer idea of its nature than that which 
comes from its calculation from formulae. This will also serve to 
explain the apparent paradox of the two regression equations such 
as y = rx and x = ry, for just as there is an uncertainty varying 
with the amount of regression in predicting an x value from a y, 
so also there exists a similar uncertainty in predicting a y value 
from an x. 

A reference to scatter diagrams will also serve to reveal whether 
correlation is linear. We shall see that although it is usually safe 
to assume that it is so, this is not invariably the case, and the line 
of best fit is then not a straight line.¹ Although the correlation 
coefficient is a measure of the degree of relationship between two 
sets of measures, it is not directly proportional to the degree of 
relationship. For instance, a correlation coefficient of .7 does not 
represent twice the degree of relationship given by a correlation 
of .35. It is also necessary to interpret the correlation coefficient 
in relative terms. A correlation coefficient of .9 would not be 
high in the case of two 'paired' and similar mental tests, whereas 
in determining the degree of relationship between a physical and 
a mental characteristic it would be difficult to find a value of r 
much greater than .5. It is common to speak of a value of r less 
than .3 as low, from .3 to .7 as medium, from .7 to .9 as high, and 
above .9 very high, but without reference to the meaning of the 
sets of measures which have been correlated, such terms may be 
entirely misleading. 

As we have already noted, the correlation coefficient enables us 
to predict with a degree of reliability which is known (and should 
be allowed for) the most likely value of a variable in one set, when 
that in the other set and the correlation coefficient between the 
sets are known. The diagram and table on page 52 illustrate 
this. 

1 See page 62. 

The Product-Moment Method of Calculating the Correlation Coefficient 
(sometimes known as the Bravais-Pearson Method¹) 


From the general consideration of the degree of agreement 
between sets of measures or scores arranged so that the average of 
each is adjusted to be zero, which we have seen when we were 
dealing with regression, it is apparent that if there is no measure 
of agreement between one set of measures and another, the sum 
of the products of deviations of corresponding scores (carrying 
appropriate signs) from the mean, will tend to be zero. If there 
is a tendency for measures above the mean in one set to corre- 
spond to measures which are above the mean in the other (taking 
the mean as zero), and those below in one to correspond with 
those below in the other, it is obvious that the total of the products 
of the deviations of each score in one set and the corresponding 
score in the other will be a positive number. Thus, the product of 
the deviations will give an idea of the existence of a positive 
correlation. In the same way, if the positive deviations of one set 
tend to correspond with negative deviations of the other their 
product will be a negative number and will give an idea of the 
negative correlation between them. 

The exact formula, known as the PRODUCT-MOMENT or 
Bravais-Pearson formula for the correlation coefficient, which is written 
as r, is 

$$ r = \frac{\sum x_1 x_2}{N \sigma_1 \sigma_2} $$

where $x_1$ and $x_2$ are the deviations of the respective scores in each 
case from the mean of each set, N is the number of cases, e.g. the 
number of pupils in a class, and $\sigma_1$ and $\sigma_2$ are the standard 
deviations of the respective sets of scores. If the scores have 
already been standardized by dividing their deviations from their 
respective means by the standard deviations, the formula becomes 


1 Bravais, a French statistician of the nineteenth century, first used the idea of 
product-moments, and his work was improved by Galton. Karl Pearson (1857-
1936), scientist and statistician, may be regarded as the successor of the latter. 
The name product-moment refers to the products of the moments (or the weights) 
of the scores in relation to their deviation from the mean. 
$$ r = \frac{\sum z_1 z_2}{N} $$

where $z_1$ and $z_2$ are the standardized scores, and further, if the 
scores have been normalized by dividing the standardized scores 
by $\sqrt{N}$, the correlation coefficient will then become $r = \sum s_1 s_2$, 
where $s_1$ and $s_2$ are the normalized scores. 
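
As a check on the arithmetic, the deviation formula and the standardized-score formula can be sketched in Python. This is only an illustrative sketch: the function names and the four pairs of marks are invented for the purpose, and the standard deviations are taken with divisor N, as in the formulae above.

```python
from math import sqrt

def product_moment_r(xs, ys):
    """r = sum(x1 * x2) / (N * sigma1 * sigma2), with x1, x2 as deviations from the means."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    dev_x = [x - mean_x for x in xs]
    dev_y = [y - mean_y for y in ys]
    sigma_x = sqrt(sum(d * d for d in dev_x) / n)
    sigma_y = sqrt(sum(d * d for d in dev_y) / n)
    return sum(a * b for a, b in zip(dev_x, dev_y)) / (n * sigma_x * sigma_y)

def r_from_standard_scores(xs, ys):
    """r = sum(z1 * z2) / N, where each z is a deviation divided by its standard deviation."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sigma_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sigma_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    z_x = [(x - mean_x) / sigma_x for x in xs]
    z_y = [(y - mean_y) / sigma_y for y in ys]
    return sum(a * b for a, b in zip(z_x, z_y)) / n

# Hypothetical marks for four pupils (not taken from the text).
maths = [2, 4, 6, 8]
physics = [1, 3, 2, 6]
print(round(product_moment_r(maths, physics), 4))
print(round(r_from_standard_scores(maths, physics), 4))
```

Both lines should print the same value, since standardizing the scores merely moves the division by the two standard deviations inside the sum.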

Where the correlation coefficient is calculated from the deviations 
$x_1$ and $x_2$, the means will hardly ever be whole numbers, 
and the exact determination of $\sum x_1 x_2$ is apt to be a laborious 
process. When calculating standard deviations we saw how it was 
possible to use an arbitrary or guessed mean which was a whole 
number, and if $x_1$ is now a deviation from an arbitrary mean the 
standard deviation 

$$ \sigma_1 = \sqrt{\frac{\sum x_1^2}{N} - \left(\frac{\sum x_1}{N}\right)^2} $$

The formula for the correlation coefficient therefore becomes 

$$ r = \frac{\dfrac{\sum x_1 x_2}{N} - \dfrac{\sum x_1}{N} \cdot \dfrac{\sum x_2}{N}}{\sigma_1 \sigma_2} $$
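
The guessed-mean working can be checked with a short Python sketch. The function name is hypothetical, and the five pairs of marks are simply the first five pupils of the worked example; the point is only that any convenient whole-number origin gives the same coefficient once the corrections are subtracted.

```python
from math import sqrt

def r_from_assumed_means(xs, ys, assumed_x, assumed_y):
    """Product-moment r using deviations from arbitrary (guessed) means."""
    n = len(xs)
    dx = [x - assumed_x for x in xs]
    dy = [y - assumed_y for y in ys]
    mean_dx, mean_dy = sum(dx) / n, sum(dy) / n
    # Each S.D. is corrected by subtracting the squared mean deviation.
    sigma_x = sqrt(sum(d * d for d in dx) / n - mean_dx ** 2)
    sigma_y = sqrt(sum(d * d for d in dy) / n - mean_dy ** 2)
    # The numerator is corrected by the product of the two mean deviations.
    numerator = sum(a * b for a, b in zip(dx, dy)) / n - mean_dx * mean_dy
    return numerator / (sigma_x * sigma_y)

marks_x = [38, 39, 61, 49, 29]
marks_y = [50, 44, 62, 54, 50]
print(r_from_assumed_means(marks_x, marks_y, 43, 57))
print(r_from_assumed_means(marks_x, marks_y, 40, 50))
```

The two printed values should agree to within rounding error, whichever guessed means are chosen.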

Example. Use the simple formula for calculating the correlation 
coefficient from the data given opposite. 

$$ r = \frac{\sum xy}{N \sigma_x \sigma_y} = \frac{\sum xy}{\sqrt{\sum x^2 \cdot \sum y^2}} = \frac{5219}{\sqrt{11{,}671} \times \sqrt{7616}} = \frac{5219}{108.0 \times 87.3} = .5535 $$

Example: CALCULATION OF CORRELATION COEFFICIENT BY 
PRODUCT-MOMENT METHOD USING THE SIMPLE FORMULA 

        % Maths.  % Physics  X - Mean X  Y - Mean Y
No.        X         Y           x           y         x²       y²       xy
 1        38        50          -5          -7         25       49       35
 2        39        44          -4         -13         16      169       52
 3        61        62          18           5        324       25       90
 4        49        54           6          -3         36        9      -18
 5        29        50         -14          -7        196       49       98
 6        51        72           8          15         64      225      120
 7        64        70          21          13        441      169      273
 8        59        70          16          13        256      169      208
 9        29        64         -14           7        196       49      -98
10        27        60         -16           3        256        9      -48
11        19        74         -24          17        576      289     -408
12        61        60          18           3        324        9       54
13        43        36           0         -21          0      441        0
14        11        48         -32          -9      1,024       81      288
15        42        46          -1         -11          1      121       11
16        46        70           3          13          9      169       39
17        72        76          29          19        841      361      551
18        62        42          19         -15        361      225     -285
19        33        64         -10           7        100       49      -70
20        40        40          -3         -17          9      289       51
21        37        62          -6           5         36       25      -30
22        39        52          -4          -5         16       25       20
23        46        72           3          15          9      225       45
24        71        78          28          21        784      441      588
25        25        28         -18         -29        324      841      522
26        19        36         -24         -21        576      441      504
27        66        80          23          23        529      529      529
28        73        78          30          21        900      441      630
29        52        60           9           3         81        9       27
30        28        46         -15         -11        225      121      165
31        53        64          10           7        100       49       70
32        20        64         -23           7        529       49     -161
33        56        48          13          -9        169       81     -117
34        24        38         -19         -19        361      361      361
35        57        64          14           7        196       49       98
36        11        46         -32         -11      1,024      121      352
37        39        56          -4          -1         16        1        4
38        60        72          17          15        289      225      255
39        29        56         -14          -1        196        1       14
40        27        32         -16         -25        256      625      400
Totals  1,707     2,284                             11,671    7,616    5,219

Mean X = 43 (rounded)    Mean Y = 57 (rounded) 
It will be observed that even with only 40 measures in each set, 
and despite the slight inaccuracy introduced by taking whole numbers 
for the means, considerable labour is involved; tables of squares and 
square roots, logarithms and a slide-rule may be used to reduce 
the labour of computation. A calculating machine which will add, 
multiply, square (and if possible divide) is of great use where much 
of this work is done.¹ 


[Scatter diagram of the grouped X and Y measures.] 


To avoid this type of calculation it is better to draw a scatter 
diagram of the data to be correlated and proceed as follows. 
(Often the data will have been given in grouped frequencies at 
the start and therefore the grouping of the measures in the form 
of a scatter diagram on squared paper is the obvious next step). 

1 An advantage of this method is that it is not necessary to turn the results back 
into terms of the original measures as the correlation is independent (or nearly so) 
of the size of the cell unit. 
Here another set of measures has been grouped into 12 rows 
and 12 columns. These numbers need not have been equal but 
11 or 12 should be regarded as a minimum, otherwise the grouping 
will be too coarse and, as the S.D.s calculated by this method will 
be too large, the correlation coefficient will tend to be too small. 
The two sets of measures to be correlated are denoted by X and Y. 
Convenient arbitrary means are chosen and the deviations of each 
group of measures are given above and below the respective means 
taken as 0. The figures in the cells give the numbers (frequencies) 
of the cases with the corresponding X and Y values. 

It should be seen that the separate totals of the X and Y values 
each come to N (the total number of cases). 


Stage 1. To calculate the Standard Deviations 


We use the Standard method of grouped frequencies as given 
in Chapter II. 


Deviations    Frequencies     Frequencies ×       Frequencies ×
                               Deviations         (Deviations)²
 x or y       f_x    f_y     f_x·x    f_y·y      f_x·x²   f_y·y²
   +6          0      0        0        0           0        0
   +5          0      1        0        5           0       25
   +4          1      2        4        8          16       32
   +3          3      4        9       12          27       36
   +2          4      3        8        6          16       12
   +1          6      8        6        8           6        8
    0          7      6        0        0           0        0
   -1          8      6       -8       -6           8        6
   -2          6      2      -12       -4          24        8
   -3          2      2       -6       -6          18       18
   -4          2      3       -8      -12          32       48
   -5          1      2       -5      -10          25       50
   -6          0      1        0       -6           0       36
 Totals      N = 40   40    +27 -39  +39 -44       172      279

Σf_x·x = 27 - 39 = -12        Σf_y·y = 39 - 44 = -5 
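$$ \frac{\sum f_x x}{N} = \frac{-12}{40} = -.3, \qquad \left(\frac{\sum f_x x}{N}\right)^2 = .09 $$

$$ \frac{\sum f_y y}{N} = \frac{-5}{40} = -.125, \qquad \left(\frac{\sum f_y y}{N}\right)^2 = .016 $$

The product $\frac{\sum f_x x}{N} \times \frac{\sum f_y y}{N} = .037$ is the correction for the numerator 
of the correlation formula and in this case it is very small. 

$$ \sigma_x = \sqrt{\frac{\sum f_x x^2}{N} - \left(\frac{\sum f_x x}{N}\right)^2} = \sqrt{\frac{172}{40} - .09} = \sqrt{4.30 - .09} = \sqrt{4.21} = 2.05 $$

Similarly $\sigma_y = \sqrt{6.97 - .016} = \sqrt{6.954} = 2.64$ 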


Stage 2. To find the sum of the total x and y products 


The frequency (number of cases) in each cell must be multiplied 
by the product of its x and y values. This can be done by consider- 
ing each possible product, and finding the total frequencies of the 
cells with each value. It is obvious that any cell with a zero value 
for x or y will contribute nothing to the total. The cells may be 
crossed out in pencil as they are dealt with. The total frequencies 
should come to N. 

A table of three columns may be constructed to give respectively 
the possible xy products (those which are not represented by actual 
cases need not be written down), the frequencies, and the product 
f × xy. 

   xy      f     f·xy
    0     10       0
  + 1      3       3
  + 2      2       4
  + 3      1       3
  + 4      1       4
  + 6      5      30
  + 8      1       8
  +12      1      12
  +15      3      45
  +20      1      20
  +24      1      24      (positive products: +153)
  - 1      6     - 6
  - 2      1     - 2
  - 3      1     - 3
  - 8      3     -24      (negative products: -35)
        N = 40   Σf·xy = 118 
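
The two stages may be verified with a few lines of Python. This is a sketch only: the function name is invented, and Σf_y·y² is taken as 279, the figure implied by the value 6.97 (= 279/40) used in working σ_y.

```python
from math import sqrt

def grouped_r(n, sum_fx, sum_fy, sum_fx2, sum_fy2, sum_fxy):
    """Product-moment r from grouped sums taken about arbitrary means."""
    mean_x, mean_y = sum_fx / n, sum_fy / n
    sigma_x = sqrt(sum_fx2 / n - mean_x ** 2)      # Stage 1: corrected S.D. of X
    sigma_y = sqrt(sum_fy2 / n - mean_y ** 2)      # Stage 1: corrected S.D. of Y
    numerator = sum_fxy / n - mean_x * mean_y      # Stage 2: corrected mean product
    return sigma_x, sigma_y, numerator / (sigma_x * sigma_y)

sigma_x, sigma_y, r = grouped_r(40, -12, -5, 172, 279, 118)
print(round(sigma_x, 2), round(sigma_y, 2), round(r, 2))
```

The printed figures should agree with the 2.05 and 2.64 found for the standard deviations and give a coefficient of about .54.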


After the correction has been subtracted the numerator is 
2.95 - .037 = 2.913 

$$ r = \frac{2.913}{\sigma_x \times \sigma_y} = \frac{2.913}{2.05 \times 2.64} = .54 $$

Another method, which is sometimes simpler than the above, is 
to apply the formula:¹ 

$$ r = \frac{\sigma_x^2 + \sigma_y^2 - \sigma_d^2}{2 \sigma_x \sigma_y} $$

where $\sigma_x$ and $\sigma_y$ are the S.D.s as before and $\sigma_d$ is a third S.D. 
calculated as follows: 

1 See page 69. 
By means of a ruler or straight edge inclined at an angle of 
45° to the horizontal the number of measures falling in diagonals 
taken from the top left-hand corner to the bottom right-hand 
corner are considered. (Each diagonal will be at right angles to 
the line drawn from the top left-hand corner to the bottom right-
hand corner. The first diagonal containing any measures will be 
that drawn from y = +1 to x = -1 and this will contain two 
measures, the next from y = 0 to x = 0 contains no measures.) 

The measures will read as follows from the table on page 44: 

2  0  1  1  6  8  6  5  9  1  0  0  1, making the correct total of 40. 

By choosing an arbitrary mean the S.D. is calculated as before 
and this will supply the value $\sigma_d$ for the formula, which is then 
worked. 
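
A short Python sketch of the diagonal method follows. It rests on stated assumptions: that successive diagonals differ by one step in the difference between the x and y deviations, that the diagonals may be numbered 0, 1, 2, ... from an arbitrary origin, and that Σf_y·y² = 279 (the figure implied by 6.97 × 40). The function name is invented for illustration.

```python
from math import sqrt

def grouped_variance(freqs):
    """Variance of a grouped distribution whose steps are numbered 0, 1, 2, ... from an arbitrary origin."""
    n = sum(freqs)
    mean = sum(k * f for k, f in enumerate(freqs)) / n
    return sum(k * k * f for k, f in enumerate(freqs)) / n - mean ** 2

# Frequencies read off the successive diagonals of the scatter diagram.
diagonal_counts = [2, 0, 1, 1, 6, 8, 6, 5, 9, 1, 0, 0, 1]

var_d = grouped_variance(diagonal_counts)
var_x = 172 / 40 - (-12 / 40) ** 2     # sigma_x squared from Stage 1
var_y = 279 / 40 - (-5 / 40) ** 2      # sigma_y squared from Stage 1

r = (var_x + var_y - var_d) / (2 * sqrt(var_x * var_y))
print(sum(diagonal_counts), round(var_d, 2), round(r, 2))
```

On these figures the diagonal method should give the same coefficient, about .54, as the method of products.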


Rank Correlation 


The product-moment method of finding the correlation co- 
efficient is undoubtedly the best way for use in scientific investiga- 
tions but when the number of cases to be considered is less than 30 
the method of ranks is just as reliable, and in some cases is even more 
so. The ranks (or orders of merit) in the two sets of marks or test 
scores are written against the names of the pupils. (It is usually 
convenient to write the names in order of merit in one subject 
and in a column to the right to add the correct order in the other 
subject with which we seek correlation.) The difference in rank 
is written in the next column and in a fourth this difference is 
squared. This column, which contains only positive squared 
numbers, is then totalled. The difference of rank is called d, each 
difference is squared ($d^2$) and these squares are summed ($\sum d^2$). 

If we consider N pupils (or cases) it is easy to prove that if N is 
not too small, the sum of the differences of ranks squared which 
would result from pure chance¹ or probability would be 

$$ \frac{N^3 - N}{6} \quad \text{or} \quad \frac{N(N-1)(N+1)}{6}, \quad \text{i.e.} \quad \frac{N(N^2-1)}{6} $$

[As N is a whole number, notice that $\frac{N(N^2-1)}{6}$ is also a whole 
number.] 

1 See Appendix IV. 


The fraction of disagreement between two sets of orders of merit 
or ranks could therefore be expressed as follows: 

$$ \frac{\text{Sum of the actual differences squared}}{\text{Sum of the differences squared which might be expected by chance}} = \frac{\sum d^2}{\tfrac{1}{6} N (N^2 - 1)} $$

If this is a measure of the amount of disagreement of the ranks, 
the measure of the agreement or correlation may be written as 

$$ 1 - \frac{\sum d^2}{\tfrac{1}{6} N (N^2 - 1)} $$

[by subtracting from unity] 

This correlation coefficient using ranks is written as ρ (the 
Greek letter rho). 


It is related to r, the correlation coefficient obtained by the 
product-moment or line of regression methods, by the formula: 

$$ r = 2 \sin\left(\frac{\pi}{6} \rho\right) $$

but this is only true statistically, i.e. in the long run, and usually 
this transformation is hardly worth while. By using ordinary 
tables of sines the method is as follows:¹ Multiply ρ by 30°. Look 
up the sine of the resulting angle and double it. This gives r. 
This relation between ρ and r is only true on the average of many 
occasions. 

1 The angle π (radians) = 180°, π/6 = 30°. 
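
The ranking procedure and the conversion from ρ to r may be sketched in Python as follows. The function names and the five pairs of marks are hypothetical; tied marks are given the average of the ranks they occupy, which is the convention the half-ranks in the worked example suggest.

```python
from math import pi, sin

def average_ranks(marks):
    """Rank marks from the highest (rank 1) downwards, giving tied marks the average of their ranks."""
    ordered = sorted(marks, reverse=True)
    rank_of = {}
    for mark in set(marks):
        first = ordered.index(mark) + 1          # best rank occupied by this mark
        count = ordered.count(mark)              # number of pupils sharing it
        rank_of[mark] = first + (count - 1) / 2  # average of the ranks occupied
    return [rank_of[m] for m in marks]

def rank_correlation(marks_a, marks_b):
    """rho = 1 - 6 * sum(d squared) / (N * (N squared - 1))."""
    ranks_a, ranks_b = average_ranks(marks_a), average_ranks(marks_b)
    n = len(marks_a)
    sum_d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - 6 * sum_d2 / (n * (n * n - 1))

def r_from_rho(rho):
    """r = 2 sin(pi * rho / 6): multiply rho by 30 degrees, take the sine and double it."""
    return 2 * sin(pi * rho / 6)

# Hypothetical marks for five pupils.
french = [90, 80, 70, 60, 50]
history = [85, 90, 60, 70, 40]
rho = rank_correlation(french, history)
print(round(rho, 2), round(r_from_rho(rho), 2))
```

The conversion leaves ρ = 0 and ρ = 1 unchanged and raises intermediate values only slightly, which is why the text regards it as hardly worth while.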


Example: CALCULATION OF CORRELATION COEFFICIENT BY RANKS METHOD 
Name    Rank in French    Rank in History    d² 
Ashley 283 264 4 
Ascough 25 24 1 
Beaumont 2l I 21 
Clifton 94 2 561 
Champkins 284 29 i 
Evans 19} 38 3421 
Foster 314 39 561 
Gill 38 6 1,024 
Gasper 191 7} 36 
Gray I 284 114 289 
Gray II 38 19} 3421 
Green 1 
Goodman 38 313 564 
Harrison 93 5 204 
Hawley. 334 15 3424 
Hill 224 31} 81 
Jackson _ 36 29 49 
Lymn * 13 64 
Marriot 164 194 9 
MacEwan 38 37 I 
Norman 314 35 121 
Norton 5 21 256 
Nelson I1 33 4621 
Newham 281 15 182} 
Newton 25 22 9 
Peak 161 29 1564 
Powdril 25 264 2i 
Pickersgill 2I 24 
Pillatt 13h 36 5064 
Rivers 164 174 I 
Robinson 19} 40 4211 
Shrewib 6 E: 
ewsbu 16 1 1 
Stafford ss) a i 
"Thornton m 113 = 
Walker 7 3 201 
Wilcox 7} 15 564 
Wright 35 24 12 
Warkinson 22] ak 225 
Wardle 23 10 56} 
$$ \rho = 1 - \frac{6 \sum d^2}{N(N^2 - 1)} = 1 - \frac{6 \times 5190\tfrac{1}{4}}{40 \times 1599} = 1 - \frac{6}{40 \times 1599} \times \frac{20761}{4} = 1 - .486 $$

Rank Correlation = .51 


A little consideration of the nature of regression lines will give 
us a clearer idea of the meaning of correlation than will come from 
an uncritical acceptance and use of the product-moment formula. 
It is sometimes thought that a correlation coefficient gives an 
exact measure in terms of a fraction or percentage of the agree- 
ment between two scores. It is indeed true that a correlation 
coefficient will give us a clue to the common elements which are 
contained in the scores. As we have seen by drawing lines of 
regression in scattergrams a correlation coefficient gives us an 
idea of the reduction of error in predicting scores in one test from 
those in another. 


It is easy by using the formula for probable error to construct a 
graph and a table of forecasting efficiency. 

$$ \text{P.E.} = .6745\, \sigma \sqrt{1 - r^2} $$

Fig. 13. The ordinates (vertical distances) give the forecasting efficiency for 
various values of the correlation coefficient. 

[Table: Correlation coefficient | Forecasting efficiency (%)] 
As r decreases the probable error of the estimation becomes 
greater. 

$\sqrt{1 - r^2}$ is called the coefficient of alienation (Kelley), 
and is useful in that it gives us an idea of how high r should be for 
satisfactory prediction. 

When r = .1 the prediction is only ½% (.005) better than pure 
chance. With r of .8 we are only 40% better than pure chance 
and with r = .95 only 69% better off! 

It can be seen that unless r has a value of at least .8 the fore-
casting efficiency will not be above 40% (⅖ths) and therefore it 
will be of little value. With a correlation coefficient of .3 the fore-
casting efficiency is less than 5% or a twentieth. 
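
The percentages quoted are consistent with taking the forecasting efficiency as one minus the coefficient of alienation. On that assumption (the function names below are invented for illustration), the figures can be reproduced with a short Python sketch:

```python
from math import sqrt

def coefficient_of_alienation(r):
    """Kelley's coefficient of alienation, sqrt(1 - r squared)."""
    return sqrt(1 - r * r)

def forecasting_efficiency(r):
    """Proportionate reduction in the error of estimate, taken here as 1 - sqrt(1 - r squared)."""
    return 1 - coefficient_of_alienation(r)

for r in (0.1, 0.3, 0.8, 0.95):
    print(f"r = {r:.2f}   forecasting efficiency = {100 * forecasting_efficiency(r):.1f}%")
```

The efficiency rises very slowly at first and only steeply as r approaches unity, which is the point the paragraph above is making.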

The correlation coefficient between two sets of scores is also 
equal to that proportion of the total variance which is due to the 
common factor. (Variance is the square of the standard deviation: 
$\sigma^2$.) This may be shown as follows: 


Suppose that a set of c primary or elemental factors are con-
tained in both scores x and y in addition to factors a which are 
contained in x but not in y and b factors which are contained in y 
but not in x. 

Thus, x = a + c 
      y = b + c 
and c is correlated with both x and y. 

Now regarding x and y as deviations from their respective means, 
the sums which equal these may be regarded as measured from 
their means. 

Then 

$$ r = \frac{\sum xy}{N \sigma_x \sigma_y} = \frac{\sum (c + a)(c + b)}{N \sigma_{c+a} \sigma_{c+b}} = \frac{\sum c^2 + \sum cb + \sum ca + \sum ab}{N \sigma_{c+a} \sigma_{c+b}} $$

But since a, b and c are all independent of one another and 
further since each is given as a deviation so that the sum of each 
alone is zero, the sum of the last three terms in the numerator 
will approach zero. 

Thus 

$$ r = \frac{\sum c^2}{N \sigma_{c+a} \sigma_{c+b}} = \frac{\sigma_c^2}{\sigma_{c+a} \sigma_{c+b}} $$

[It is an important property of variances that they can be added 
algebraically. 

Thus, $\sigma_c^2 + \sigma_a^2 = \sigma_{c+a}^2$. 

This is easily proved: 

$$ \sigma_c^2 + \sigma_a^2 = \frac{\sum c^2 + \sum a^2}{N} $$

But a and c are uncorrelated and are in the form of deviations 
from respective means. Thus $2 \sum ac = 0$. 
Adding this to the numerator 

$$ \sigma_c^2 + \sigma_a^2 = \frac{\sum c^2 + 2 \sum ac + \sum a^2}{N} = \frac{\sum (c + a)^2}{N} = \sigma_{c+a}^2 $$
] 

Substituting in our equation for r 

$$ r = \frac{\sigma_c^2}{\sqrt{\sigma_c^2 + \sigma_a^2}\, \sqrt{\sigma_c^2 + \sigma_b^2}} $$

Now assume that $\sigma_a^2 = \sigma_b^2$. This will be so if the factors that 
accompany the x variable are as potent as those that accompany 
the y variable (but are independent of the correlation). 

Thus 

$$ r = \frac{\sigma_c^2}{\sqrt{\sigma_c^2 + \sigma_a^2}\, \sqrt{\sigma_c^2 + \sigma_a^2}} = \frac{\sigma_c^2}{\sigma_c^2 + \sigma_a^2} = \frac{\sigma_c^2}{\sigma_{c+a}^2} $$

The Correlation of Three Variables 


Sometimes we have three sets of correlation coefficients by con- 
sidering three sets of variables, or attainments taken in three pairs. 
It may be necessary to find the correlation between any two of the 
variables supposing that the third were kept constant. Such a 
case would be to find the correlation between school attainment 
and estimations of character with intelligence kept constant. 
The partial correlation formula is as follows 

$$ r_{12.3} = \frac{r_{12} - r_{13} \times r_{23}}{\sqrt{(1 - r_{13}^2)(1 - r_{23}^2)}} $$

$r_{12}$, $r_{13}$ and $r_{23}$ are the correlation coefficients of the scores 1, 2 
and 3 taken in pairs. $r_{12.3}$ is the correlation coefficient of scores 
1 and 2 with 3 kept constant. 

As a further example we may consider correlation of age, 
height and weight. Let us call them x years, y inches and 
z lb. respectively. We can correlate them in pairs and find $r_{xy}$, 
$r_{xz}$ and $r_{yz}$ but each of these correlations is affected by the third 
variable. 

The formula enables us to calculate the correlation between any 
two, say x and y, left uninfluenced by the third. 

In this case the correlation coefficient 

$$ r_{xy.z} = \frac{r_{xy} - r_{xz} r_{yz}}{\sqrt{(1 - r_{xz}^2)(1 - r_{yz}^2)}} $$

For convenience of reference we give the standard error now but 
this will be more fully explained in a later chapter. 

$$ \text{Standard error} = \frac{1 - r^2}{\sqrt{N}} $$

where r is the particular correlation coefficient which is 
required. 
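
A minimal Python sketch of the partial correlation formula is given below. The function names are invented, and the three coefficients for age, height and weight are hypothetical figures chosen only to show the working; the standard error is taken in the simple form (1 - r²)/√N.

```python
from math import sqrt

def partial_r(r12, r13, r23):
    """Correlation of variables 1 and 2 with variable 3 kept constant."""
    return (r12 - r13 * r23) / sqrt((1 - r13 ** 2) * (1 - r23 ** 2))

def standard_error_of_r(r, n):
    """Standard error of a correlation coefficient, taken as (1 - r squared) / sqrt(N)."""
    return (1 - r * r) / sqrt(n)

# Hypothetical coefficients: age-height, age-weight, height-weight.
r_xy, r_xz, r_yz = 0.8, 0.6, 0.7
r_xy_z = partial_r(r_xy, r_xz, r_yz)
print(round(r_xy_z, 3), round(standard_error_of_r(r_xy_z, 100), 3))
```

The partial coefficient is lower than the raw coefficient of .8 here, because part of the agreement between the first two variables is accounted for by their common dependence on the third.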


Tetrachoric Correlation 

TETRACHORIC CORRELATION means a method of correlation using 
four groups (as the Greek name implies). In these methods we 
have data limited to the number of cases or the proportion of 
cases in each of two categories in each set. 

Suppose we have a number of pupils who are given tests in 
science and mathematics. We can divide them into four groups. 


a = Number above average in both science and mathematics. 
b = Number above average in science but below average in 
mathematics. 
c = Number below average in science and above average in 
mathematics. 

d = Number below average in both science and mathematics. 

                   Science
                   a | b
Mathematics       ---+---
                   c | d

Pearson's coefficient is 

$$ \rho = \cos\left(\frac{\sqrt{bc}}{\sqrt{ad} + \sqrt{bc}}\, \pi\right) $$

The value of the expression within the bracket is calculated. This 
is multiplied by 180° and the cosine of the resultant 'angle' found 
from the tables. 


It will be seen that the total number of cases (e.g. the number of 
pupils) = a + b + c + d if we can disregard pupils who are 
exactly on the average line.¹ 

Example: In an examination taken by 40 candidates 14 were 
above average in both science and mathematics, 6 were above 
average in science and below in mathematics, 6 were below 
average in science and above in mathematics, 14 were below 
average in science and in mathematics. 


1 When the divisions (i.e., the dichotomic lines) are at the respective means the 
formula simplifies itself to 

$$ \rho = \sin\left[360^\circ \times \frac{ad - bc}{N^2}\right], \quad \text{i.e.} \quad \rho = \sin\left[\frac{2\pi}{N^2}(ad - bc)\right] $$

where N = total number of measures = a + b + c + d. 
                   Science
                   14 | 6
Mathematics       ----+----
                    6 | 14

By the formula 

$$ \rho = \cos\left(\frac{\sqrt{36}}{\sqrt{196} + \sqrt{36}} \times 180^\circ\right) = \cos\left(\frac{6 \times 180^\circ}{20}\right) = \cos 54^\circ = .5878 $$
A modification of the above is sometimes useful as it gives a 
conservative (or even modest) idea of the intensity of association. 
It is known as the coefficient of colligation ω and is due to Yule. 

$$ \omega = \frac{\sqrt{ad} - \sqrt{bc}}{\sqrt{ad} + \sqrt{bc}} $$

Using the same data as above: 

$$ \omega = \frac{\sqrt{196} - \sqrt{36}}{\sqrt{196} + \sqrt{36}} = \frac{14 - 6}{14 + 6} = \frac{8}{20} = .4 $$

The Method of Unlike Signs due to Sheppard 

U = percentage of 'unlike' signs (that is, of cases with one score 
above and one below average in the two tests) 
  = b + c, expressed as a percentage of N 
L = percentage of 'like' signs (that is, the sum of cases with both 
scores above or below average respectively) 
L + U = 100 (as U and L are percentages) 

Sheppard's coefficient 

$$ s = \cos\left(\frac{U}{L + U}\, \pi\right) = \cos\left(\frac{180\, U}{100}\right)^\circ = \cos (1.8\, U)^\circ $$

Thus, the percentage of unlike signs must be found, multiplied by 
1.8 (i.e. 9/5) and the cosine of this number regarded as an angle in 
degrees found from tables. 

In the example used for Pearson's formula above, the percentage 
of unlike signs is 12/40 × 100% = 30% 

s = cos (1.8 × 30)° = cos 54° 
  = .5878 
which is precisely the coefficient which we found above. (This 
does not always happen but usually there is close agreement.) 


Coefficient of Association due to Yule 

From our tetrachor table we can measure the intensity of 
association between two sets of data using Q, the coefficient of 
association 

$$ Q = \frac{ad - bc}{ad + bc} $$

Using the same data as above 

$$ Q = \frac{14 \times 14 - 6 \times 6}{14 \times 14 + 6 \times 6} = \frac{196 - 36}{196 + 36} = \frac{160}{232} = .69 $$


This method produces a generous estimate. 
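
The four coefficients obtained from the fourfold table may be compared side by side with a Python sketch. The function names are invented for illustration; the cell values are those used in the worked calculations (ad = 196, bc = 36).

```python
from math import cos, pi, sqrt

def pearson_cos_pi(a, b, c, d):
    """Pearson's cos-pi coefficient: cos(pi * sqrt(bc) / (sqrt(ad) + sqrt(bc)))."""
    return cos(pi * sqrt(b * c) / (sqrt(a * d) + sqrt(b * c)))

def yule_colligation(a, b, c, d):
    """Yule's coefficient of colligation: (sqrt(ad) - sqrt(bc)) / (sqrt(ad) + sqrt(bc))."""
    return (sqrt(a * d) - sqrt(b * c)) / (sqrt(a * d) + sqrt(b * c))

def sheppard_unlike_signs(a, b, c, d):
    """Sheppard's coefficient: cosine of (proportion of unlike signs) * 180 degrees."""
    unlike = (b + c) / (a + b + c + d)
    return cos(pi * unlike)

def yule_q(a, b, c, d):
    """Yule's coefficient of association: (ad - bc) / (ad + bc)."""
    return (a * d - b * c) / (a * d + b * c)

a, b, c, d = 14, 6, 6, 14
for name, fn in [("Pearson cos-pi", pearson_cos_pi), ("Yule colligation", yule_colligation),
                 ("Sheppard", sheppard_unlike_signs), ("Yule Q", yule_q)]:
    print(f"{name:18s} {fn(a, b, c, d):.4f}")
```

On the same table the coefficient of colligation gives the lowest figure and Q the highest, with the two cosine methods between them.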


The table used in calculating tetrachoric coefficients is some-
times called a 2 × 2 table. 

Assuming that the scores, had they been known, were normally 
distributed in both arrays, Pearson evolved the following formula 
for tetrachoric r, 

$$ \frac{ad - bc}{z z' N^2} = r_t + \frac{x x'}{2} r_t^2 $$

where x and x′ are the sigma-distances from the means to the 
points separating the proportion in the upper category from the 
proportion in the lower category, z and z′ are the heights of 
the ordinates at the points of division. 

a, b, c and d are the usual entries in the four cells, N is the number 
of cases, i.e. (a + b + c + d). $r_t$ is the tetrachoric coefficient of 
correlation. It will be seen that $r_t$ is found by solving a quadratic 
equation and the correct solution will lie between zero and unity. 
Computing diagrams, which give $r_t$ by graphic methods when the 
contents of the four cells of the table are known, can be used to 
save much labour where many tetrachors have to be calculated.¹ 


Biserial correlation 

Sometimes it is necessary to correlate sets of data when they are 
given in the form of two mutually exclusive groups in respect of 
one set and in numerical scores in respect of the other. Such 
dichotomies in the first set would be given by sex differences, 
married and unmarried persons, trained and untrained teachers, 
graduates and non-graduates, children of a particular age group 
attending school and those of the same age who have left school, 
etc. The following example taken from a study of a hundred boys 
and girls, sixteen to eighteen years of age who have left school 
and another group remaining at school will illustrate this.² 


The biserial coefficient of correlation is given by 

$$ r_{bis} = \frac{(M_3 - M_2)\, pq}{z\, \sigma_t} $$

1 See Brit. Journal Psychology, March 1949, No. XXXIX part 3. Also THURSTONE, 
Computing Diagrams for the Tetrachoric Correlation Coefficient, Univ. Chicago, 1933, 
and PEARSON, KARL, 'On the Correlation of Characters not Quantitatively Measurable', 
Phil. Trans. London A. 195 (1900) 1-47. 

2 By Elwood Sones. 'A Study of one hundred boys and girls sixteen to eighteen 
years of age who have left school and a similar group remaining at school' (according 
to size of families). The correlation between 'Staying at School' and size of 
family is only .176. 

      (1)              (2)            (3)          (4)
No. of Children    Remained in    Left School     Total
   in Family         School
      12                2              0             2
      11                4              3             7
      10                4              2             6
       9                4              8            12
       8               20              3            23
       7               10             17            27
       6               24             12            36
       5               18             18            36
       4               30             10            40
       3               34             12            46
       2               34             10            44
       1               16              5            21
    Means             4.57           5.31          4.82 


where $M_3$ and $M_2$ are the means of the third and second 
columns respectively, p is the proportion represented by column (3) 
(those leaving school), q (= 1 - p) is the proportion represented 
by column (2), i.e. those remaining at school, $\sigma_t$ is the standard 
deviation of the distribution in column (4), and z is obtained from 
the normal distribution curve tables for a 'tail' of p [p = .33].¹ 
In the case of the above data we may work out the formula as 
follows 

$$ r = \frac{(5.31 - 4.57)(.33)(.67)}{(2.57)(.3625)} = .176 $$

Provided that q is not less than .05 the standard error of biserial r 
is given by 

$$ \frac{\dfrac{\sqrt{pq}}{z} - r^2}{\sqrt{N}} $$

The probable error is about ⅔ of this 
or more exactly is found by multiplying by .6745. 
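
The biserial working may be checked with a short Python sketch. It assumes that z is the ordinate of the unit normal curve at the point cutting off a tail of proportion p, and it obtains that ordinate from `statistics.NormalDist` in place of printed tables; the function names are invented for illustration, and the figures are the rounded ones used in the working above.

```python
from statistics import NormalDist

def ordinate_for_tail(p):
    """Height of the unit normal curve at the point cutting off a tail of proportion p."""
    unit_normal = NormalDist()
    return unit_normal.pdf(unit_normal.inv_cdf(1 - p))

def biserial_r(mean_p, mean_q, p, sigma_total):
    """Biserial r = (M_p - M_q) * p * q / (z * sigma_total)."""
    q = 1 - p
    z = ordinate_for_tail(p)
    return (mean_p - mean_q) * p * q / (z * sigma_total)

z = ordinate_for_tail(0.33)
r = biserial_r(5.31, 4.57, 0.33, 2.57)
print(round(z, 4), round(r, 3))
```

The ordinate found in this way is very slightly smaller than the tabled .3625, but the coefficient still comes to about .176.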


1 Obtained from the table on page 91. The means are found by adding the 
products of the figures in column 1 with the corresponding figures in columns 1, 2 
and 3 respectively and dividing the respective totals by the sums of the numbers 
in columns 1, 2 and 3 respectively. 
In finding biserial r no assumptions are made concerning the 
shape of the distribution, provided that it is not so distorted that 
the standard deviation is made to differ appreciably from that of 
a random sample and also that the two ‘tails’ of the distribution 
fit together to make a complete normal distribution. As Peters 
and van Voorhis have shown, it would not do to take the top and 
lower ends of a distribution and omit the middle (e.g. the hundred 
best and the hundred poorest teachers in a large number of 
teachers). There must be a proper dichotomy. 

It will repay the reader to try to find what lies hidden in 
each numerical coefficient of correlation. Quite apart from a 
consideration of the probable error of the coefficient as will be 
calculated from the formulae it is necessary to ask whether the 
correlation is real or fictitious. There would be a considerable 
degree of correlation between the heights of children and their 
reading ability, but both of these attributes would be dependent 
on a third hidden quantity — the age. Or again, there is a well- 
marked correlation between general ability and freedom from 
physical defects but as Spearman has remarked this may be due 
to a hidden factor of ‘psycho-physical’ energy. In certain aspects 
of science it is becoming increasingly difficult to state a cause and 
proceed from it to an effect, but mathematical analysis comes to 
its aid and can show the nature of the measurement of agreement 
between two sets of quantities. 

The lines of regression (the lines of best fit) which we have 
hitherto considered have been straight. The correlation has been 
spoken of as linear. But the quantities met with in psychology do 
not always correlate in this way. For instance, Webb’s character- 
factor w, known as persistence of motives or consistency of action 
resulting from 'will', correlates with perseveration p in the 
following manner: 


1 Another more subtle example is given by the apparent positive correlation 
between the intelligences of ‘only’ children and the ages of mothers bearing them. 
There is a tendency for highly cultured and intelligent women either to marry 
comparatively late in life or to bear their first child at a later age than average. 
High intelligences tend to be inherited. Here is the hidden factor. 

Fig. 14. Non-linear correlation. 


Thus both high and low perseverators would tend to have low 
character scores, and the highest character scores would be 
associated with moderate perseveration. 


In this case we use a correlation ratio η (eta) which is given by 

$$ \eta = \sqrt{1 - \frac{\sigma_e^2}{\sigma_y^2}} $$

where $\sigma_e$ is the standard error of estimate (the standard deviation 
of one of the sets of measures) and $\sigma_y$ is the standard deviation 
of the other.¹ 


1 The calculation of correlation ratios is often a difficult and lengthy procedure. 
The reader is referred to GARRETT, Statistics in Psychology and Education for an 
example of this long method worked in a fairly simple way. 


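SOME WORKED EXAMPLES OF STANDARD DEVIATION AND 
CORRELATION COEFFICIENTS 

Arithmetic Test A given to 28 pupils aged 16+ 

(i) Results (out of 100)   (ii) Select 62 as an arbitrary mean   (iii) Median = 69 (by interpolation)   (iv) D² 

 Mark      D       D²
  98      +36     1296
  96      +34     1156
  88      +26      676
  86      +24      576
  86      +24      576
  84      +22      484
  84      +22      484
  82      +20      400
  80      +18      324
  80      +18      324
  76      +14      196
  74      +12      144
  74      +12      144
  72      +10      100
  66      + 4       16
  62        0        0      (positive deviations: +296)
  60      - 2        4
  52      -10      100
  52      -10      100
  52      -10      100
  44      -18      324
  44      -18      324
  44      -18      324
  40      -22      484
  30      -32     1024
  28      -34     1156
  26      -36     1296
  18      -44     1936      (negative deviations: -254)
N = 28    + 42   14068

Mean = 62 + 42/28 = 62 + 1.5 = 63.5 
ΣD²/N = 14068/28 = 502.43 
(M - A)² = (1.5)² = 2.25 

S.D. = √(502.43 - 2.25) 
     = √500.18 
     = 22.36 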
CALCULATION OF STANDARD DEVIATION (continued) 

Class Interval     F      d      Fd     Fd²
     0-9           0     -6       0       0
    10-19          1     -5      -5      25
    20-29          2     -4      -8      32
    30-39          1     -3      -3       9
    40-49          4     -2      -8      16
    50-59          3     -1      -3       3
    60-69          3      0       0       0
    70-79          4     +1      +4       4
    80-89          8     +2     +16      32
    90-99          2     +3      +6      18
                N = 28          +26     139
                                -27

This uses the same data as the previous example. It will be noted that the 
standard deviation calculated by the grouped frequency method differs slightly 
from the correct result given by the longer method. The distribution is skewed 
and hence the result calculated by the grouped frequency method is further 
affected. 
Arithmetic Test B given to same 28 pupils aged 16+ 

(i) Results (out of 100)   (ii) Select 60 as arbitrary mean   (iii) Median = 65   (iv) D² 

 Mark      D       D²
  80      +20      400
  80      +20      400
  79      +19      361
  75      +15      225
  75      +15      225
  70      +10      100
  70      +10      100
  70      +10      100
  70      +10      100
  70      +10      100
  68      + 8       64
  66      + 6       36
  65      + 5       25
  65      + 5       25
  65      + 5       25
  63      + 3        9
  62      + 2        4
  61      + 1        1
  60        0        0      (positive deviations: +174)
  57      - 3        9
  56      - 4       16
  55      - 5       25
  52      - 8       64
  45      -15      225
  37      -23      529
  35      -25      625
  32      -28      784
  24      -36     1296      (negative deviations: -147)
N = 28    + 27    5873

Total = 1707        ΣD = +174 - 147 = +27 
Mean = 1707/28 = 60.96, or Mean = 60 + ΣD/N = 60 + 27/28 = 60 + .96 = 60.96 
ΣD²/N = 5873/28 = 209.75 
(M - A)² = (.96)² = .9216 

S.D. = √(209.75 - .9216) 
     = 14.45 
3 
CALCULATION OF STANDARD DEVIATION (continued) 

Class Interval     F      d      Fd     Fd²
     0-9           0     -6       0       0
    10-19          0     -5       0       0
    20-29          1     -4      -4      16
    30-39          3     -3      -9      27
    40-49          1     -2      -2       4
    50-59          4     -1      -4       4
    60-69          9      0       0       0
    70-79          8     +1      +8       8
    80-89          2     +2      +4       8
                N = 28          +12      67
                                -19
                         ΣFd =   -7

ΣFd²/N = 67/28 = 2.3929 
(ΣFd/N)² = (-7/28)² = .0625 

S.D. = 10 √(2.3929 - .0625) 
     = 10 × 1.526 
     = 15.26 

(Notice the slight difference in the result from that obtained by the long method 
of working, owing to grouping and skew distribution.) 
CORRELATION BETWEEN ARITHMETIC TESTS A AND B 


Method (a) 
Rank Coefficient of Correlation (Spearman) 


Dif- 
E 6zd* 
9 Mark | Mark | Rank | Rank | ference p — NNE 
Name| inA | inB | inA | inB (d) d 
or Sketa 
a 44 70 22 8 14 196 28 (28 1) 
P 74 [21 12] | 18 st | 3oi 7269 
Y 28 35 26 26 o o = 1 — 38 x 783 
5 18 37 28 25 3 9 E 
etc. 88 8o 3 I 1d 24 Ge, 351 
3 82 70 8 8 o o = .669 
30 56 25 21 4 16 
66 65 15 14 I 1 
60 70 17 9 81 
76 | 65 11 14 3 9 
62 55 16 22 6 36 P 
86 66 44 | 12 74 | 56k 
80 75 9i 4h 5 25 
80 70 94 14 24 
84 63 6b | 16 9i | 90} 
26 24 27, 28 I H 
| 52 70 19 8 Ir 121 
52 52 19 23 4 16 
74 68 123 | 11 I$ 21 
86 79 4} 3 i 21 
44 62 22 17 5 25 
49 | 32 | 24 | 27 3 9 
98 8o I I + i 
52 57 19 20 1 1 
44 | 45 | 22 | 24 2 4 
96 60 2 19 17 | 289 
84 | 65 6} | 14 7i| s6i 
72 75 14 44| 9i! 908 
Method (b).
Product-Moment Coefficient of Correlation (Pearson)

Name   Mark    Mark
       in A    in B      x      y       xy       x²      y²
α       44      70     -20     +9     -180      400      81
β       74      61     +10      0        0      100       0
γ       28      35     -36    -26     +936     1296     676
δ       18      37     -46    -24    +1104     2116     576
etc.    88      80     +24    +19     +456      576     361
        82      70     +18     +9     +162      324      81
        30      56     -34     -5     +170     1156      25
        66      65      +2     +4       +8        4      16
        60      70      -4     +9      -36       16      81
        76      65     +12     +4      +48      144      16
        62      55      -2     -6      +12        4      36
        86      66     +22     +5     +110      484      25
        80      75     +16    +14     +224      256     196
        80      70     +16     +9     +144      256      81
        84      63     +20     +2      +40      400       4
        26      24     -38    -37    +1406     1444    1369
        52      70     -12     +9     -108      144      81
        52      52     -12     -9     +108      144      81
        74      68     +10     +7      +70      100      49
        86      79     +22    +18     +396      484     324
        44      62     -20     +1      -20      400       1
        40      32     -24    -29     +696      576     841
        98      80     +34    +19     +646     1156     361
        52      57     -12     -4      +48      144      16
        44      45     -20    -16     +320      400     256
        96      60     +32     -1      -32     1024       1
        84      65     +20     +4      +80      400      16
        72      75      +8    +14     +112       64     196

N = 28    Av. of A = 63.5 (take av. to be 64)    Av. of B = 60.96 (take av. to be 61)
Σxy = +6920    Σx² = 14012    Σy² = 5847

$$r = \frac{\Sigma xy}{\sqrt{\Sigma x^2 \times \Sigma y^2}} = \frac{6920}{\sqrt{14012 \times 5847}} \approx .765$$
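
The column totals of Method (b) can be verified with a short sketch. It follows the working shown in the table, taking deviations from the assumed averages 64 and 61 and applying no further correction for the small difference from the true averages; the function name is an illustrative choice.

```python
import math

marks_a = [44, 74, 28, 18, 88, 82, 30, 66, 60, 76, 62, 86, 80, 80,
           84, 26, 52, 52, 74, 86, 44, 40, 98, 52, 44, 96, 84, 72]
marks_b = [70, 61, 35, 37, 80, 70, 56, 65, 70, 65, 55, 66, 75, 70,
           63, 24, 70, 52, 68, 79, 62, 32, 80, 57, 45, 60, 65, 75]


def pearson_from_assumed_means(a, b, mean_a, mean_b):
    """r = sum(xy) / sqrt(sum(x^2) * sum(y^2)) with deviations from assumed means."""
    x = [v - mean_a for v in a]
    y = [v - mean_b for v in b]
    sum_xy = sum(p * q for p, q in zip(x, y))
    sum_x2 = sum(p * p for p in x)
    sum_y2 = sum(q * q for q in y)
    return sum_xy, sum_x2, sum_y2, sum_xy / math.sqrt(sum_x2 * sum_y2)


sum_xy, sum_x2, sum_y2, r = pearson_from_assumed_means(marks_a, marks_b, 64, 61)
print(sum_xy, sum_x2, sum_y2, round(r, 3))
```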
Method (c).
Product-Moment Coefficient of Correlation by Grouping and Diagonal Adding

Scatter table (entries are numbers of pupils):

                           Test A
Test B    10-  20-  30-  40-  50-  60-  70-  80-  90-  |   f     d     fd    fd²
90+                                                    |   0
80+                                          1    1    |   2    +3    +6     18
70+                      1    1    1    1    4         |   8    +2   +16     32
60+                      1         1    3    3    1    |   9    +1    +9      9
50+                 1         2    1                   |   4     0     0      0
40+                      1                             |   1    -1    -1      1
30+        1    1        1                             |   3    -2    -6     12
20+             1                                      |   1    -3    -3      9
10+                                                    |   0
                                                       | N = 28      +21     81

f          1    2    1    4    3    3    4    8    2     N = 28
d         -4   -3   -2   -1    0   +1   +2   +3   +4
fd        -4   -6   -2   -4    0   +3   +8  +24   +8     Σfd = +27
fd²       16   18    4    4    0    3   16   72   32     Σfd² = 165

Diagonal totals (d = deviation in Test B minus deviation in Test A):

  d      f      fd     fd²
 +3      1      +3      9
 +2      4      +8     16
 +1      2      +2      2
  0      7       0      0
 -1     10     -10     10
 -2      3      -6     12
 -3      1      -3      9
      N = 28    -6     58

$$A = 81 - \frac{21^2}{28} = 81 - 15.75 = 65.25$$

$$B = 165 - \frac{27^2}{28} = 165 - 26.04 = 138.96$$

$$C = 58 - \frac{6^2}{28} = 58 - 1.29 = 56.71$$

$$r = \frac{A + B - C}{2\sqrt{AB}} = \frac{65.25 + 138.96 - 56.71}{2\sqrt{65.25 \times 138.96}} = .775$$


N.B. It will be observed that there are slight differences in the results between those obtained
by grouping and those from columns of separate scores. The number N which occurs in
numerator and denominator of the final expression is omitted.
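
The grouping and diagonal adding of Method (c) may be sketched as follows. The sketch assumes class intervals of ten marks with deviations counted from the 50-59 class in both tests, which is how the scatter table appears to be arranged; the helper `corrected_sum_sq` is a name introduced here for illustration.

```python
import math

marks_a = [44, 74, 28, 18, 88, 82, 30, 66, 60, 76, 62, 86, 80, 80,
           84, 26, 52, 52, 74, 86, 44, 40, 98, 52, 44, 96, 84, 72]
marks_b = [70, 61, 35, 37, 80, 70, 56, 65, 70, 65, 55, 66, 75, 70,
           63, 24, 70, 52, 68, 79, 62, 32, 80, 57, 45, 60, 65, 75]


def corrected_sum_sq(devs):
    """Sum of squared step-deviations, corrected to the true mean."""
    n = len(devs)
    return sum(d * d for d in devs) - sum(devs) ** 2 / n


# Step-deviations in units of one class interval, counted from the 50-59 class.
dev_a = [m // 10 - 5 for m in marks_a]
dev_b = [m // 10 - 5 for m in marks_b]
# Each diagonal of the scatter table holds the pupils with the same difference.
diagonal = [b - a for a, b in zip(dev_a, dev_b)]

A = corrected_sum_sq(dev_b)      # from the row totals (Test B)
B = corrected_sum_sq(dev_a)      # from the column totals (Test A)
C = corrected_sum_sq(diagonal)   # from the diagonal totals

r = (A + B - C) / (2 * math.sqrt(A * B))
print(round(A, 2), round(B, 2), round(C, 2), round(r, 3))
```

The identity behind the method is that the sum of squared differences equals the two sums of squares less twice the sum of products, so that (A + B - C)/2 stands in for the sum of products.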




Examples to be worked by the Student 
1. Construct a frequency table of the following marks gained
by a class in a test, in which the highest possible mark is 10:
5, 8, 9, 1, 7, 4, 2, 3, 5, 3, 6, 6, 7, 6, 6, 7, 4, 4, 7, 6, 6, 5, 4, 3, 8,
9, 10, 6, 6, 7, 8, 7, 4, 6, 2, 5, 5, 7, 8, 4, 5, 6, 5, 5, 6, 7, 5, 3.
Draw a histogram, a frequency polygon and a cumulative
frequency curve.


2. The following are the numbers of children in the schools in
a certain rural area:


30, 47, 21, 23, 32, 15, 25, 41, 38, 56, 33, 32, 14, 25, 18, 37, 62, 54,
60, 31, 27, 26, 19, 34, 27, 43, 19, 51, 36, 28, 40.

Taking class intervals of 5 make a frequency distribution table,
histogram and ogive.

If the figures given above represent an age range of nine years,
calculate the probable number of eleven year olds. If 85% of these
can be expected to proceed to a modern secondary school, about
how many grammar school entrants will there be annually from
this district?


3. The following marks (out of a possible 60) were gained by 50
students in an examination:


31, 13, 20, 31, 30, 45, 38, 42, 30, 30, 30, 46,
36, 2, 41, 44, 18, 26, 44, 30, 19, 5, 44, 15,
9, 13, 7, 25, 12, 30, 6, 22, 24, 31, 15, 6,
39, 32, 21, 20, 42, 31, 19, 14, 23, 28, 17, 53,
22, 21.
Construct a grouped frequency distribution table. Draw a
histogram of frequency distribution. Calculate (i) median,
(ii) arithmetic mean, (iii) standard deviation of the scores.


4. A mental arithmetic test was given to two groups of children.
‘A’ group consisted of 40 thirteen year old girls in a secondary
modern school, and ‘B’ group of the same number of girls of
similar age in a secondary grammar school. The results were as
follows:


Score           No. of Girls    No. of Girls
(out of 100)    in ‘A’          in ‘B’
90-100               1               1
80-89                2               2
70-79                1               2
60-69                4               6
50-59                5              10
40-49                7               9
30-39                6               6
20-29                7               3
10-19                5               1
0-9                  2               0


(a) Calculate the mean and median for each distribution.
(b) Draw the frequency polygon for each distribution.
(c) Comment on the results comparing the abilities of the two
groups.


5.
Marks         Arithmetic      English
              Examination     Examination
10 to 15           1               -
15 to 20           1               -
20 to 25           2               -
25 to 30           4               -
30 to 35           5               -
35 to 40           6               -
40 to 45           7               -
45 to 50           7               -
50 to 55           8               -
55 to 60           7               -
60 to 65           6               3
65 to 70           5               9
70 to 75           5              12
75 to 80           4              15
80 to 85           3              18
85 to 90           2              10
90 to 95           1               7
95 to 100          0               0
Total No. of
Candidates        74              74


In an entrance examination to secondary school the marks
obtained by a group of 74 candidates in Arithmetic and English
were as in the table above.

(a) Find for both examinations the mean, the median and the
mode.

(b) Graph (using the same axes and making the curves as
smooth as possible) the distribution of marks in each case.

(c) Arrange the three measures (mean, median and mode) in
their order of interest to the teacher of the group keen on his
pupils’ performances (give reasons for the order you
choose). What other information (or fourth measure)
would this teacher look for? Suggest how such a measure
might be devised.


6. The following scores were obtained in a school entrance test 
given to select 80 pupils from 200 applicants:— 


Score.       Frequency.
95 - 99           0
90 - 94           4
85 - 89          10
80 - 84          25
75 - 79          28
70 - 74          37
65 - 69          38
60 - 64          22
55 - 59          12
50 - 54          13
45 - 49           6
40 - 44           1
35 - 39           1
30 - 34           3
25 - 29           0


(a) Construct the histogram and frequency polygon.
(b) Calculate the mean and median of the scores.


(c) Make any relevant comments on the results of this test, and 
on its suitability as a basis of selection. 


7. The following table gives the frequency of football matches
in which a certain number of goals were scored in a 5-week period.


Total no. of goals
scored per match ..    0    1    2    3    4    5    6    7    8    9
No. of matches   ..   16   31   50   50   34   16   10    9    2    0


Draw a histogram to illustrate these data. Find the median score
per match.

Give reasons why this histogram should differ from the curve
of normal distribution and give another example of a set of
statistics which might give a similar distribution.


8.
Class          Frequency     Frequency
Intervals.     Group A.      Group B.
 20- 29            0             0
 30- 39            2             9
 40- 49            4             3
 50- 59            4             4
 60- 69            6             6
 70- 79           10             9
 80- 89           16            11
 90- 99           14             1
100-109           10             9
110-119            8             4
120-129            2             6
130-139            2             5
140-149            0         [illegible]
150-159            0             2
160-169            0             2
170-179            0             2
180-189            0             0


The table above shows the frequency distributions of two groups,
A and B, of people in an intelligence test. Decide for both groups
the class intervals in which fall the medians (M), and the lower and
upper Quartiles Q₁ and Q₃ (i.e. the measurements whose values
are such that one quarter of the whole series is below Q₁, and one
quarter is above Q₃).

Assuming the test is properly standardized, what conclusions
would you draw about the comparative intellectual constitutions
of the groups? Clarify your conclusions by plotting on the same
chart both distributions in the form of column graphs.
9. From the following table of Intelligence-Ratios of 1,000
children (a) draw a histogram of the data,
(b) calculate the arithmetic mean.


                              Cumulative
I.R.          Frequency.      Frequency.

135-139            1             1000
130-134            2              999
125-129           10              997
120-124           19              987
115-119           22              968
110-114           37              946
105-109           71              909
100-104           90              838
 95- 99          134              748
 90- 94          142              614
 85- 89          139              472
 80- 84          132              333
 75- 79           89              201
 70- 74           60              112
 65- 69           23               52
 60- 64           13               29
 55- 59           10               16
 50- 54            5                6
 45- 49            1                1

Total           1000


CHAPTER IV 


THE PROBLEM OF ERROR 


But to us probability is the very guide to life. 
BISHOP BUTLER — Analogy of Religion 


As the results which we have obtained so far assume that we
have been handling very large numbers of cases it is necessary
to consider what happens when we make our experiments
with smaller samples. It is obvious that the statistical laws which
we use will be free from errors to an extent related to the number
of cases which we can investigate. A very simple example will
suffice to show this. If we toss a penny a sufficiently large number
of times, say 100,000, we should expect the ratio of heads to tails
to be 1 to 1 with a very tiny possible error in the 1 : 1 ratio. If
we toss the coin only 10 times it may happen that we get 3 heads
and 7 tails but in the case of 100,000 trials the chances of getting
30,000 heads and 70,000 tails are so exceedingly remote as to have
no statistical interest for us. In other words as the number of
trials gets larger and larger the ratio of heads to tails approaches
nearer and nearer to its true limit.

The problem before us now is to try to find just how reliable are
the results of our investigations on various numbers of cases.
An ordinary school class may contain no more than 25 or 30
children. Again, when we have to deal with rather lengthy
investigations it is necessary to limit the number of cases
considered in order that the research can be completed in a reasonable
time.

Thus, all the investigations on a metrical basis which we make
in psychology and education will have to be qualified by an
estimate of the size of the error which is likely to arise, and we
shall have to consider its size in relation to the size of other factors
concerned, such as a correlation coefficient. In the
analysis of variance, that due to error may be compared with the
variance due to other causes under consideration.
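
The penny-tossing illustration can be put into figures. The sketch below assumes a fair coin and independent tosses; the function names are chosen here for illustration only. It gives the exact chance of three or fewer heads in ten tosses, and shows how far a proportion of 30% heads lies from expectation, in standard deviations, when 100,000 tosses are made.

```python
import math


def prob_heads_at_most(k, n):
    """Exact binomial probability of k or fewer heads in n tosses of a fair coin."""
    return sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n


def sd_of_proportion(n):
    """Standard deviation of the proportion of heads in n tosses of a fair coin."""
    return 0.5 / math.sqrt(n)


p_small = prob_heads_at_most(3, 10)
gap_in_sd = (0.5 - 0.3) / sd_of_proportion(100_000)
print(round(p_small, 3), round(gap_in_sd))
```

With ten tosses such a result is quite ordinary; with 100,000 tosses it lies more than a hundred standard deviations from the expected proportion.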
It is clearly impossible to take such large samples, in normal
procedures, to ensure that each sample is a true cross-section of
the entire population. Suppose that we have been finding the
correlation coefficients between two sets of scores in subjects A and 
B and that we have been able to continue our investigations with 
a large number of similar groups of children. We should not 
find the correlation coefficient to be quite the same in any two 
groups of children owing to errors of sampling; we should find a 
central tendency in all the correlation coefficients and it would be 
apparent that the correlation coefficients would satisfy the normal 
law of distribution. To find the probable error we should want to 
know how far from the mean or central value of the correlation 
coefficient is the line which divides one-half of the coefficients 
from the rest. If the dispersion were great compared with the 
value of the correlation coefficient, that is, if the P.E. were more 
than a small fraction of the correlation coefficient, we should 
regard the latter as being unreliable. 

Investigators trained in the physical sciences tend to reject any 
results where the correlation coefficient is not more than four times 
greater than the probable error, but a less rigorous attitude has 
prevailed in psychological investigations and results which are 
no greater than three times the probable error are accepted as 
being significant. Even these should be treated with great 
caution and the investigation should be continued with further 
critical exploration of method and data. In writing down a
correlation coefficient or other result we should therefore add the 
value of the probable error. 

Probable error is another term for quartile deviation or the
semi-interquartile range. Usually, however, the term quartile
deviation is only applied to simple measures and probable error is
used with derived or secondary measures, as for instance standard
deviation or the correlation coefficient. The obvious way of finding
the probable error would be to arrange the measures in order
or to count them and to take half; but more often the probable
error is found from the standard error (or deviation) and the use
of the formula P.E. = .6745σ (i.e. .6745 × S.D.).

¹ An example of the difficulties which beset sampling performed even with
considerable numbers under scientific conditions which satisfy all statistical demands
is a recent failure of the Gallup polls to predict an American election result. It is
admitted that its use for prediction is always more risky than for the analysis of
fairly stable conditions.

It may be well to examine the meaning of the word probable.
If we say that ‘It is probable that it will rain tomorrow’ we
really mean that the chances that it will rain are more than
those that it will keep fine, that is to say that the chance is
perceptibly greater than a ‘50-50’ chance. The expression ‘probable
error’ though time-honoured is misleading and really means half
the measures on each side of the central point. (A rough
approximation is that probable error = ⅔ × standard deviation.)¹


$$\text{Probable Error of Mean} = .6745\,\frac{\sigma}{\sqrt{N}}$$

$$\text{Probable Error of Standard Deviation} = .6745\,\frac{\sigma}{\sqrt{2N}}$$

$$\text{Probable Error of Correlation Coefficient } r = .6745\,\frac{1 - r^2}{\sqrt{N}}$$


The reader must not be misled by the use of the word probable,²
and the formulae simply give the chances that the mean or other
derivatives will lie within a certain distance of the true value.
In the case of the mean the chances that it lies between + probable
error and - probable error are 1 to 1. The chances that
it lies inside the limits become greater as the limits increase: for
instance

between  - P.E.   and  + P.E.    the chances are        1 to 1
         - 2 P.E. and  + 2 P.E.   ,,     ,,     ,,     4.5 to 1
         - 3 P.E. and  + 3 P.E.   ,,     ,,     ,,      21 to 1
         - 4 P.E. and  + 4 P.E.   ,,     ,,     ,,     142 to 1
         - 5 P.E. and  + 5 P.E.   ,,     ,,     ,,    1310 to 1
         - 6 P.E. and  + 6 P.E.   ,,     ,,     ,,  19,200 to 1


¹ These matters will become clearer when the chapter on the Normal Curve is
read. It should be remembered that the relation between standard and probable
errors only holds if normal distribution of the errors can be assumed.

² The popular treatment of probability in terms of ‘odds for’ and ‘odds against’
should be qualified by a more systematic mathematical treatment. Here ‘certainty’
is denoted by a probability of 1 and an ‘impossibility’ by a probability of 0. The
mathematical probability of an event lies between 0 and 1 and may be expressed
as a fraction, decimal fraction or a percentage.

If the probability that an event will happen is given by the fraction $\frac{1}{x}$ (i.e. 1
chance in x and not 1 to x), the probability against the event happening will be
$1 - \frac{1}{x}$ or the fraction $\frac{x - 1}{x}$.


The chances that the mean lies outside these limits are given by
interchanging the figures in the two right-hand columns.
The chances expressed in terms of standard deviations:

Limits                   Frequencies of deviations    Odds against deviations
                         outside these limits         falling outside these limits

± P.E. = ± .6745σ            2 × 25%                          1 to 1
± σ                          2 × 15.9%                        2 to 1
± 2σ                         2 × 2.28%                       21 to 1
± 3σ                         2 × .135%                      370 to 1
± 4σ                         2 × .0032%                  15,600 to 1
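
The odds in the two tables can be recomputed from the normal integral. The sketch below assumes a normal distribution of errors and uses the error function from Python's standard library; `odds_within` is an illustrative name. The exact values differ a little from some of the rounded figures given in the tables.

```python
import math

PE_IN_SIGMA = 0.6745  # one probable error expressed in standard deviations


def odds_within(z):
    """Odds (to 1) that a normal deviate lies within +/- z standard deviations."""
    inside = math.erf(z / math.sqrt(2))
    return inside / (1 - inside)


for k in range(1, 7):
    print(f"+/- {k} P.E.:  {odds_within(k * PE_IN_SIGMA):,.1f} to 1")
for k in range(1, 5):
    print(f"+/- {k} sigma: {odds_within(k):,.1f} to 1")
```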


The standard error (or standard deviation) does not tell us how 
much our result is in error but rather the chances that the result 
has an error of a particular magnitude. 


Summary of the Probable Errors of Correlation Coefficients


r is the correlation coefficient found by the product-moment or
line of regression.

$$\text{P.E.} = .6745\,\frac{1 - r^2}{\sqrt{N}}$$

ρ (rho) is the correlation coefficient found by rank method

$$\text{P.E.} = .706\,\frac{1 - \rho^2}{\sqrt{N}}$$

When the true value of r is zero, that is when there is no correlation
between two arrays of scores, the formula for standard error
of r is

$$\sigma_r = \frac{1}{\sqrt{N}}$$

or for samples of about 30 or slightly less

$$\sigma_r = \frac{1}{\sqrt{N - 1}}$$

e.g. In the case of N = 26, $\sigma_r$ = .2 or slightly more.


It is left to the reader to calculate the probabilities that apparent
correlation coefficients of ±.2, ±.4, ±.5 may occur even when the
true value is zero.

In a moderately large sample if r is equal to its standard error 
the odds are about 4 to 1 that there is a degree of correlation 
between the two sets of numbers, if the ratio is 2 the odds are 
43 to 1 and if 3 the odds are about 740 to 1. 

It will be noticed that in each case the denominator contains
$\sqrt{N}$, the square root of the number of cases considered. The
consequence of this is that if we quadruple the number of cases
(e.g. consider 120 pupils instead of 30) the probable error is
reduced by a half, and it will be reduced to a third if the number
of cases is multiplied by nine. The actual expression is
$\sqrt{N - 1}$ but when N is sufficiently large it is customary
to write $\sqrt{N}$ which is near enough for most practical purposes.


e.g. (a) find the P.E. where r = .9, N = 36.

$$\text{P.E.}_r = \frac{.6745\,(1 - .9^2)}{\sqrt{36}} = \frac{.6745 \times .19}{6} = .0213$$

r = .9 ± .0213


In writing the probable error in this way it must be remembered
that the P.E. is given as a probability and not as an actuality.

(b) find the P.E. where r = .4, N = 16.

$$\text{P.E.}_r = \frac{.6745\,(1 - .4^2)}{\sqrt{16}} = \frac{.6745 \times .84}{4} = .142$$

r = .4 ± .142


Here the P.E. is more than a third of the correlation coefficient. 
The latter cannot therefore be considered reliable or even 
significant. It would have been better in the investigation to have 
used all possible means to take a greater number of cases than 16. 
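
The two worked examples can be repeated with a short sketch, which also expresses each coefficient as a multiple of its probable error. The function name `probable_error_r` is an illustrative choice and not part of the text.

```python
import math


def probable_error_r(r, n):
    """Probable error of a product-moment coefficient: .6745 (1 - r^2) / sqrt(N)."""
    return 0.6745 * (1 - r * r) / math.sqrt(n)


pe_a = probable_error_r(0.9, 36)
pe_b = probable_error_r(0.4, 16)
ratio_a = 0.9 / pe_a   # how many probable errors the coefficient amounts to
ratio_b = 0.4 / pe_b
print(round(pe_a, 4), round(pe_b, 4), round(ratio_a, 1), round(ratio_b, 1))
```

In example (a) the coefficient is many times its probable error; in example (b) it falls short of three times, which is why it cannot be called significant.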

P.E. of tetrachoric r where the dichotomic lines are at the means 


:6745 (2m Vic a [e + d) (c + Df 
/N 4N* 

and where true r = 0

$$\text{P.E.} = \frac{.6745\,\pi}{2\sqrt{N}}$$

The probable error does not give a very good estimate of the
reliability of r when N is small and r is large. Accordingly Fisher
has suggested that r should be replaced by its hyperbolic
arc-tangent $\tanh^{-1} r$ which he calls z′ and for which he provides
tables.

$$\tanh^{-1} r = z' = \tfrac{1}{2}\left[\log_e (1 + r) - \log_e (1 - r)\right] = 1.1513\left[\log_{10} (1 + r) - \log_{10} (1 - r)\right]$$


¹ The nature of the ratio between a coefficient and its standard error or deviation
must be carefully considered. The figure which is taken really means that the
chances that the coefficient has no significance are reduced to such an extent that
we have reason to believe that there is good evidence of significance. There is no
case of conclusive proof. As a figure equal to twice the standard deviation only
occurs about once in 22 cases Fisher suggests that this may be regarded as
significant. As probable error is about ⅔ × standard error or deviation, Fisher's
suggestion is that 3 × probable error would be a significant quantity.

McCall has suggested a ratio of 2.78 × standard deviation (i.e. about 4.17 ×
probable error), but this is larger than we usually find in psychological and educational
experiments even when other considerations lead us to believe that there should be
significance and some notable degree of correlation between our figures.

Peters suggests that a figure somewhat less than that of Fisher's may be
permitted. He takes the point on the probability curve where it bends to a maximum
degree as the distribution thins out to a long tail. This gives a value of 1.73 × S.E.
or 2.6 × P.E. and for this he proposes the term working ratio.

In each case the student should fortify himself by finding what is the extent of the
probability from the tables of the integral of the normal probability curve and it
should be kept in mind that probability does not imply certainty.
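
Where tables of z′ are not at hand the transformation is easily computed. This sketch checks that the natural-logarithm form and the common-logarithm form agree with the hyperbolic arc-tangent supplied by Python's standard library; the function names are illustrative.

```python
import math


def fisher_z(r):
    """z' = 1/2 [log_e(1 + r) - log_e(1 - r)]."""
    return 0.5 * (math.log(1 + r) - math.log(1 - r))


def fisher_z_common_logs(r):
    """The same quantity from common logarithms; the multiplier is (log_e 10) / 2."""
    multiplier = math.log(10) / 2
    return multiplier * (math.log10(1 + r) - math.log10(1 - r))


for r in (0.2, 0.5, 0.9):
    print(r, round(fisher_z(r), 4), round(fisher_z_common_logs(r), 4))
```

For small r the value of z′ is nearly equal to r itself; the two part company as r approaches 1.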


Many experimenters would feel that results obtained by investiga- 
tions with less than 25 cases would be so unreliable as to be of 
negligible worth and where any rigorous research was undertaken 
a hundred cases or more should be considered. 

In this book we have not only dealt with standard errors but 
also with ‘probable errors’. Modern practice quite rightly tends 
to abandon the use of P.E. wherever possible. We have used P.E. 
here in an introductory way as its nature can easily be grasped by 
considering a rank or order of merit. No confusion need exist 
in the student’s mind in view of the simple numerical relationship
between S.E. and P.E. 

Other standard errors which are useful in educational research 
are as follows: 

Standard error of a difference between the averages of scores
which are intercorrelated. If we wish to consider the significance
of the difference between the averages of scores in two tests or in
repeated tests taken by a single set of persons

$$\text{S.E.} = \sigma_D = \sqrt{\frac{1}{N}\left(\sigma_1^2 + \sigma_2^2 - 2r\sigma_1\sigma_2\right)}$$

or if $\sigma_1$ and $\sigma_2$ are taken to represent the S.E.s of the means of the
original scores and not the S.D.s of the original scores

$$\sigma_D = \sqrt{\sigma_1^2 + \sigma_2^2 - 2r\sigma_1\sigma_2}$$

In view of the differences which arise through errors of sampling
the average of a sample may vary from the true average which
would be found if we were able to take a very large number.

$$\text{The S.E. of the mean or average } \sigma_M = \frac{\sigma}{\sqrt{N}}$$

where σ is the standard deviation of the original sample.
In the same way differences in the nature of samples (‘errors of
sampling’) may cause errors in the S.D.s of a sample.

$$\text{The standard error of a standard deviation } \sigma_\sigma = \frac{\sigma}{\sqrt{2N}}$$

The standard error of a difference between two standard
deviations is equal to

$$\sqrt{\frac{\sigma_1^2}{2N_1} + \frac{\sigma_2^2}{2N_2}}$$

where $\sigma_1$ and $\sigma_2$ are the standard deviations and $N_1$ and $N_2$ are
the numbers of cases in the respective groups or sets.


Standard error of a percentage and of a difference between percentages.
If x is the percentage then

$$\text{Standard Error of } x = \sqrt{\frac{x(100 - x)}{N}}$$

and the standard error of a difference between two percentages
$x_1$ and $x_2$ is

$$\sqrt{\frac{x_1(100 - x_1)}{N_1} + \frac{x_2(100 - x_2)}{N_2}}$$
The formulae are most useful in finding the numbers of cases
which it is necessary to investigate in order to be certain that
percentage differences between groups are significant, e.g. It
appears from dental records that 40% girls and 43% boys at
certain schools are in need of dental treatment. What is the
minimum number of children which we must take in order to make sure
that the 3% difference is significant?
If the difference of 3% is reliable it should be more than 3 times
its S.E., and even at 2 × S.E. the chances would only be about
1 in 21 against its significance.

∴ S.E. should not be greater than 1%, and with N children in each group

$$1 = \sqrt{\frac{40 \times 60}{N} + \frac{43 \times 57}{N}}, \qquad N = 2400 + 2451 = 4851$$

Thus to make sure that the 3% is significant (chances 370 to 1 for
significance which is large) the investigation should be based on the
examination of 4851 (say 5000) boys and an equal number of girls.
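
The dental example may be generalised in a few lines. The sketch assumes, as the example does, two groups of equal size; `cases_needed` and `se_percentage_difference` are names introduced here for illustration.

```python
import math


def se_percentage_difference(x1, n1, x2, n2):
    """Standard error of the difference between two percentages x1 and x2."""
    return math.sqrt(x1 * (100 - x1) / n1 + x2 * (100 - x2) / n2)


def cases_needed(x1, x2, max_se):
    """Smallest equal group size for which the S.E. of the difference is at most max_se."""
    return math.ceil((x1 * (100 - x1) + x2 * (100 - x2)) / max_se ** 2)


n_strict = cases_needed(40, 43, 1.0)   # difference of 3% to be 3 x S.E.
n_looser = cases_needed(40, 43, 1.5)   # difference of 3% to be 2 x S.E.
print(n_strict, n_looser)
```

Relaxing the requirement from three standard errors to two reduces the number of children needed in each group to less than half.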

The standard deviation of the skew of a distribution may also
be useful occasionally.

$$\sigma_{sk} = \frac{.5185\,D}{\sqrt{N}}$$

where $D = P_{90} - P_{10}$
(i.e. score at 90th percentile - score at 10th percentile)


Note on Student's ‘t’:¹


Student's ‘t’ is defined as

$$t = \frac{x}{\sigma_x}$$

where x is the deviation of a measure from the true value which
is assumed from a normal distribution and $\sigma_x$ is the standard
deviation of all the measures in the sample. Student worked out
the distribution of t (which he originally called z) and found that
it was particularly useful for working with small samples. At first 
Student carried his table only to N = 10 and found that the 


Pa vet I 2 
standard error of his distribution was VN and later Fisher 


developed the table in terms of N - 1 degrees of freedom. Most
of Fisher's tables are constructed so that a probability of 5%
(odds of 20 to 1) is significant and a probability of 1% is highly
significant. In the case of a normal distribution (N very large) a
probability of 5% corresponds to a t of 1.96 and a probability of
1% corresponds to a t of 2.58.


¹ ‘Student’, whose real name was William Sealy Gosset, died in 1937. He was a
senior member of the brewing firm of Guinness in whose service he developed much
of his statistical work. He chose his pseudonym out of respect for the ‘master’
Pearson.


Test Reliability and Test Length


If, after a sufficient interval, a test is applied again under
similar circumstances there should be a high degree of correlation
between the two sets of scores. Moreover, if the test is a good one
it should be largely independent of the qualities and skills of those
administering it.

If a test is reliable it can only be so if it is thorough and this will
depend to a large extent on its length. If tests are supplied in
double form so that there are two parallel tests, a re-test with the
second set should produce results with a high degree of correlation,
that is, upwards of .9, with the first set. When two similar tests
are not supplied, a single test is converted into two by taking the
odd-numbered questions as a shortened first test and the
even-numbered as the second test. By shortening the test its reliability
is also reduced and therefore it is necessary to have some means of
predicting the reliability of a test if it were lengthened.

Suppose r is the correlation coefficient of the results of the two
halved tests. Then if R is the correlation coefficient between the
complete given test and an imaginary one of similar type

$$R = \frac{2r}{1 + r}$$

In a general case, where a test is imagined to be lengthened n
times, we may use the Spearman-Brown prophecy-formula:

$$R = \frac{nr}{1 + (n - 1)r}$$

(of which the formula for the doubled tests is the simplest case).
We can calculate the reliability or the limits of variation of
individual scores when we know the reliability coefficient.

$$\text{Probable error} = .6745\,\sigma\sqrt{1 - r^2}$$

e.g. if there is a correlation of .95 between intelligence tests and
the standard deviation of the intelligence quotients is 15 then

$$\text{P.E. of I.Q.} = .6745 \times 15 \times \sqrt{1 - .95^2} \approx 3.2$$


This means that about half the people taking the second test will
have I.Q.s which differ from those which they obtained in the
first test by no more than about 3 points. By considering the way in
which the expression $\sqrt{1 - r^2}$ becomes larger as r becomes smaller
the student will see how rapidly the probable error increases as
the reliability coefficient r drops below .9. Unfortunately an r of
.95 is exceedingly rare. It should be added that the reliability of
a test will appear to be lower than it can be taken to be, if it is
given to groups which are too homogeneous and therefore do not
permit proper sampling both in respect of age and abilities. The
difference in reliability as given by tests with two groups of
different ‘spread’ (i.e. homogeneity or heterogeneity) is given by
the formula

$$R = 1 - \frac{\sigma_2^2\,(1 - r)}{\sigma_1^2}$$

where R is the reliability to be expected with a group with
standard deviation of I.Q. = $\sigma_1$ and r the reliability with a group
of S.D. $\sigma_2$.
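
The prophecy-formula and the probable error of an individual score, both given earlier in this section, can be tried out with a few values. This is a sketch only; the function names are illustrative, and the value r = .8 for the half-tests is an assumed figure chosen to show the effect of doubling.

```python
import math


def spearman_brown(r, n=2):
    """Reliability predicted when a test of reliability r is lengthened n times."""
    return n * r / (1 + (n - 1) * r)


def probable_error_of_score(sd, reliability):
    """Probable error of an individual score: .6745 * sd * sqrt(1 - r^2)."""
    return 0.6745 * sd * math.sqrt(1 - reliability ** 2)


doubled = spearman_brown(0.8)                   # half-tests correlating .8 (assumed)
pe_high = probable_error_of_score(15, 0.95)     # the example worked in the text
pe_lower = probable_error_of_score(15, 0.80)
print(round(doubled, 3), round(pe_high, 2), round(pe_lower, 2))
```

The last two figures show how quickly the probable error of an I.Q. grows once the reliability coefficient falls below .9.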


CHAPTER V 


THE NORMAL CURVE OF DISTRIBU- 
TION AND ITS USES 


Most students are familiar with the well-known bell-shaped
curve and we have already noticed it when we were
considering the distribution of measures with respect to
a central tendency. It is now convenient to consider more
carefully the nature of this important curve. For the reader who can
deal with simple calculus some of its mathematical properties have
been worked out in Appendix III. For the purposes of the present
section it will suffice if we examine the shape of the curve and
know the meaning of the heights of various lines drawn vertically
in it and the significance of areas bounded by the curve and cut
off by such lines. The quantitative aspects of such lines and areas
will be given in simple tables. The curve is sometimes called the
Laplacian or Gaussian curve in honour of Laplace and Gauss who
respectively used it in their work on probability. For reasons
which will be apparent it is also called the probability curve or
curve of error. One of its most fruitful early uses was to deal with
experimental errors in astronomical observations.

A word of warning must be uttered concerning the use of the
so-called ‘normal’ curve. Too often in the past the adjective
‘normal’ has been misused. The distribution of the velocities of
molecules of a gas, or that of the quantitative measures of errors
in respect of many physical observations may under certain conditions
where there are no biasing factors conform to such a curve. Even
here the mathematical theory of pure chance in the distribution
usually preceded any attempt to check its validity, which has to
be assumed without experiment in many cases. In the case of
‘mental measurements’ the matter is much more difficult. We
have no theoretical basis for expecting such distributions, and in
fact factors can be imagined which may cause skewing. In an
intelligence test scale we are not dealing with the physicist’s
‘class A’ measures such as length, speed and mass. We can obtain
a length of 130 cm. by adding one of 70 cm. to another of 60 cm.
We cannot obtain an I.Q. of 130 by adding one of 70 to another
of 60. Each I.Q. must be referred separately to an arbitrary 
scale. It would be foolish to assume that there is a fundamental 
‘law of normality’ which applies to most sets of educational and 
psychological data. Most of the groups and samples with which 
we have to deal in psychological research are only defined in a 
vague and ambiguous manner and the degree of homogeneity in 
traits other than the one which we are considering is seldom 
sufficient to eliminate their effect. 

It is impossible to talk about the form of a distribution being 
normal with any meaning unless we specify the type and classifica- 
tion of the individuals concerned. 

Certain physical characteristics such as weight show reasonably
good normal distribution for individuals of the same sex, race,
age and height, but even here the curve is positively skewed, as in
‘normal’ times excessive overweight is more common than
excessive underweight. The use of the word ‘normal’ whether it
describes the times in which we live, a person’s behaviour, or a 
distribution needs careful consideration. This is not to despise 
its use in educational research, but the early use of the distribution 
to deal with errors and deviations from a mean is still the most 
useful. A curious example of ‘circular reasoning’ sometimes takes 
place with respect to intelligence tests. Such tests are usually 
devised to give a ‘normal’ distribution of the scores with certain 
population classes. It is to be expected therefore that when they 
are applied to the testing of similar population classes the distribu- 
tion should be normal. The symmetrical bell-shaped curve is 
useful because it is susceptible to easy mathematical treatment, 
but here again we must not be ensnared by the attempts which 
mental testers have made to give numerical assessments of 
intelligence along a scale of numbers. This scale has none of the 
properties of a graduated rule or length. The boy with I.Q. 130 
is not twice as intelligent as a boy of I.Q. 65. There is in fact
hardly any means of comparing these individuals; the first able
to benefit by Grammar School teaching and the other practically
a moron. The ‘man in the street’ who said that the first was
‘a thousand times as intelligent’ as the second would, in spite of
exaggeration, have the germ of truth in him.

¹ A distribution which does not conform to the ‘normal curve’ may be quite
normal in the usual sense. In educational measurements and calculations the words
‘normal distribution’ refer to the curve.


Fig. 14a.

The mean, mode and median of the curve are equal and are
marked y₀ on the central axis of y, about which line the curve is
symmetrical. The area of the curve represents the total number of
scores or measures which are distributed. By drawing vertical
lines we can measure the areas enclosed by the curve which are
cut off by them. These represent the numbers of scores which are
beyond or within a certain value of the score.

If there is good dispersion of the scores the curve is wide and 
well-rounded, but if, on the other hand, there is not much 
dispersion and the scores deviate but little from the mean, the 
curve is thin, sharp and pointed. 

It will be observed that at points on the curve, known as points 
of inflexion, the convex shape of the top part of the curve gives way 
to the concavity of the lower part of each side. These points are 
at a distance σ (standard deviation) on each side of the central 
point. 

The curve is said to be asymptotic to the axis of x (that is the 
horizontal base line). This means that the curve approaches this 
line ever more closely as it is extended at both sides. It is said to meet the 
line ‘at infinity’. The standard deviation σ is a convenient unit 
for measuring distances along the x axis. Exceedingly little of the 
area of the curve remains at distances greater than 3σ on each 
side of the central line. 

It is convenient to reduce all distances along the x axis to 
sigma-units by dividing the x distances by σ. 


The amount of the area enclosed by the whole curve lying between 
verticals at distances of σ on each side of the central line is 
68.26%, 

that enclosed between verticals at distances of 2σ on each side of 
the central line is 95.44%, 

and that enclosed between verticals at distances of 3σ on each 
side of the central line is 99.73%. 
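
These three areas can be checked without tables. The following is a minimal sketch in Python, assuming only the standard `math` module; the function name `percent_within` is introduced here for illustration and is not part of the original text. The computed values may differ from the quoted ones by a unit in the last figure, since four-figure tables are rounded.

```python
import math

def percent_within(k_sigma: float) -> float:
    """Per cent of the total area lying within k_sigma standard deviations of the mean."""
    # For a normal curve the area between -k and +k sigma is erf(k / sqrt(2)).
    return 100.0 * math.erf(k_sigma / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} sigma of the mean: {percent_within(k):.2f}%")
```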

The following table gives the proportion (percentage) of the 
total area under the normal curve between the central line (mean 
ordinate) and an ordinate (vertical line) at any given distance (in 
sigmas) from the mean. 


TABLE I 
PER CENT OF TOTAL AREA UNDER THE NORMAL CURVE 
BETWEEN MEAN ORDINATE AND ORDINATE AT ANY 
GIVEN SIGMA-DISTANCE FROM THE MEAN 

[Body of Table I is illegible in the source.] 
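
Entries of an area table of this kind can be regenerated when a printed copy is not to hand. The sketch below assumes Python's standard `statistics.NormalDist`; the function name `area_mean_to` and the choice of rows printed are illustrative assumptions, not the layout of the original table.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1

def area_mean_to(sigma_distance: float) -> float:
    """Per cent of total area between the mean ordinate and the ordinate at sigma_distance."""
    return 100.0 * (STANDARD_NORMAL.cdf(sigma_distance) - 0.5)

# Print a short extract at half-sigma steps from 0 to 3 sigma.
for half_steps in range(0, 7):
    z = half_steps / 2
    print(f"{z:4.1f}  {area_mean_to(z):6.2f}")
```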


The next table gives the ordinates (the vertical heights) under 
the normal curve at various x distances (in terms of standard 
deviation) from the mean. The ordinates are given as proportions 
of the mean ordinate, that is, the greatest height of the curve. 
Such a table is useful if we desire to find the frequency at a certain 
point, e.g. the number of cases with a certain score. 
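
The ordinates described above, expressed as proportions of the mean ordinate, follow directly from the equation of the normal curve. A minimal sketch, assuming only Python's `math` module (the name `relative_ordinate` is illustrative):

```python
import math

def relative_ordinate(sigma_distance: float) -> float:
    """Height of the normal curve at a given sigma-distance, as a proportion of the mean ordinate."""
    # y / y0 = exp(-z^2 / 2), where z is the distance from the mean in sigma units.
    return math.exp(-0.5 * sigma_distance ** 2)

for z in (0.0, 0.5, 1.0, 2.0, 3.0):
    print(f"{z:3.1f}  {relative_ordinate(z):.4f}")
```

Multiplying such a proportion by the mean ordinate gives the expected frequency at that point.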
TABLE II 
ORDINATES UNDER THE NORMAL CURVE AT VARIOUS SIGMA-DISTANCES 
FROM THE MEAN (ORDINATES EXPRESSED AS 
PROPORTIONS OF THE MEAN ORDINATE) 


x/σ   .00   .01   .02   .03   .04   .05   .06   .07   .08   .09 

This table is useful when values have to be fitted to a curve. 
The area table can be used as follows: 

1. It is consulted if we wish to find the number or proportion 
of cases in a normal distribution which lie on one side of a point 
along the scale. 


Example: An I.T. set of scores has a mean of 100 and S.D. of 15. 
Find the percentage of scores which lie above 120. 
This score of 120 is 20 above the mean 
or in terms of sigma-scores 20/15 or 1.333 above the mean. 

From the table we see that an x/σ value of 1.33 gives a percentage 
of 40.82 for the area between the mean ordinate and the given 
one. (By interpolation we get the value of 40.88 for 1.333.) 

As the curve is symmetrical about the mean ordinate 50% of 
its area lies above (to the right of) this line. 

Thus the percentage of scores which lie above 120 is 
(50 − 40.88)% = 9.12%. 

To convert this to an actual number we should multiply the 
total number of cases by 9.12/100. 
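
The worked example can be verified as follows. This is a sketch assuming Python's standard `statistics.NormalDist`; the group size `n_cases = 500` is a hypothetical figure chosen only to show the conversion to a number of cases.

```python
from statistics import NormalDist

# Mean 100 and S.D. 15, as in the example above.
scores = NormalDist(mu=100, sigma=15)

# Per cent of scores lying above 120.
percent_above = 100.0 * (1.0 - scores.cdf(120))

# Hypothetical group size, used only to illustrate the conversion to a count.
n_cases = 500
expected_cases = n_cases * percent_above / 100.0

print(f"{percent_above:.2f}% of scores lie above 120")
print(f"about {expected_cases:.0f} cases out of {n_cases}")
```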

2. It is easy to extend the above to find the percentage or 
number of cases which lie between two points on the scale. The 
process outlined in (1) is repeated in respect of both points and 
a simple subtraction gives the required result. 
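
A sketch of this second use, again assuming `statistics.NormalDist`; the limits 85 and 115 and the name `percent_between` are illustrative. Working from the cumulative area avoids having to decide whether the two table areas are to be added (points on opposite sides of the mean) or subtracted (points on the same side).

```python
from statistics import NormalDist

def percent_between(lower: float, upper: float, mean: float, sd: float) -> float:
    """Per cent of cases in a normal distribution lying between two points on the scale."""
    dist = NormalDist(mu=mean, sigma=sd)
    return 100.0 * (dist.cdf(upper) - dist.cdf(lower))

# Illustrative limits one standard deviation either side of a mean of 100 (S.D. 15).
print(f"{percent_between(85, 115, mean=100, sd=15):.2f}%")
```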

3. The table may also be used to find the point on the scale 
above or below which a given number or percentage of the cases 
in a normal distribution lie. This is the reverse of (1). 

Suppose 15% of the cases lie above the required point. Then, 
considering only one side of the curve, (50 − 15)% or 35% of the 
cases will lie between it and the central line. We therefore search 
in the body of the table to find an x/σ value corresponding to this. 

The value is 1.036 (by interpolation) and if σ = 15 
the required point is 1.036 × 15 along the x axis. 

If the mean is given by 100 this point will be 100 + 1.036 × 15 
= 115.5. 

This type of calculation may be extended to find the x distance 
on each side of the mean which cuts off a certain middle proportion 
of the cases. We can halve this proportion and 
work on one side of the mean only, thus taking advantage of the 
symmetrical properties of the curve. 
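
The reverse use of the table corresponds to the inverse of the cumulative area. A minimal sketch, assuming `statistics.NormalDist.inv_cdf` (Python 3.8 or later); the two function names are introduced here for illustration only.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()

def sigma_distance_cutting_off_upper(percent_above: float) -> float:
    """Sigma-distance above the mean beyond which percent_above of the cases lie."""
    return STANDARD_NORMAL.inv_cdf(1.0 - percent_above / 100.0)

def sigma_half_width_of_middle(percent_middle: float) -> float:
    """Sigma-distance on each side of the mean enclosing the middle percent_middle of cases."""
    # Halve the middle proportion and work on one side of the mean only.
    return STANDARD_NORMAL.inv_cdf(0.5 + percent_middle / 200.0)

z = sigma_distance_cutting_off_upper(15)   # about 1.036
point = 100 + z * 15                       # mean 100, S.D. 15, as in the example
print(f"x/sigma = {z:.3f}, point on scale = {point:.1f}")
print(f"middle 50% lies within {sigma_half_width_of_middle(50):.4f} sigma of the mean")
```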

4. The curve may also be used for finding certain probable 
values and for obtaining an understanding of what is meant by 
probable error. There are various arithmetical ways of expressing 
a probability. If we say that ‘it will probably rain tomorrow’ we 
mean that the chances of rain are greater than those that it will 
keep fine, that is, slightly more than the 1 : 1 or even chance. 
The probability is rather more than ½ or 50%. In the case of the 
‘normal curve’, probabilities are measured as ratios or percentages 
of a particular area compared with that of the whole. If the ratio 
or percentage is a small one the probability is correspondingly 
small. For example, a probability of 2½% would be 1 chance in 
40; a probability of 98% would be 49 chances in 50. Statistics is 
full of probabilities and the student should try to think in these 
terms. Probabilities are not certainties but refer to what is likely 
to happen in the long run and with a sufficiently large number 
of cases. Even though the chances that an event will happen or 
that a result is significant may be very much greater than the 
chances that the event will not happen or that the result is not 
significant, there is still an uncertainty. Many of the so-called 
‘laws of science’ are to be thought of as being true to the extent 
of a large probability based on the results of a great number of 
observations. Probabilities of a sequence of chance happenings 
are subject to the rules of the behaviour of a single happening 
and no further prediction can be made. For instance, if we toss 
a penny four times and four successive ‘heads’ result, the probability 
that we shall throw a ‘tail’ on the fifth toss is no greater nor 
less than it was at the start. It is still an ‘even chance’, i.e. a 
probability of ½ or 50%. 

Suppose that the curve represents ‘errors’ or deviations from 
the mean. If we divide the area of the curve into halves by taking 
the ‘middle’ half of the scores we shall have 25% of the measures 
on each side of the mean line. The chances are even that any 
measure selected at random will lie within the ‘middle’ half of 
the scores. 

We can find the distance of the x value which marks the 
boundary of the 25% of area by consulting the table. A rough 
value of x/σ is .67, but by interpolation or by consulting a book of 
statistical tables we can obtain a more accurate value. We find 
that the chances are even (the probability is ½) that any measure, 
score or error selected at random from a normal distribution will 
deviate from the mean by more (or less) than .6745σ. 

TABLE III 
PER CENT OF TOTAL AREA UNDER THE NORMAL CURVE 
BETWEEN MEAN ORDINATE AND ORDINATE AT ANY 
GIVEN P.E. DISTANCE FROM THE MEAN¹ 

1 x/P.E. is distance along x axis divided by probable error. 

.6745σ is called probable deviation, and a probable error is 
.6745 × standard error. 

A third table gives the areas of the normal curve under certain 
values of x expressed in terms of probable deviation instead of 
standard deviation (sigma, σ) values. As is to be expected, 25% 
of the area on either side of the central line gives an x/P.E. value 
of 1. 
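
The relation between probable deviation and area can be sketched in Python as below. The constant and function names are illustrative assumptions; only the factor .6745 comes from the text.

```python
import math

PROBABLE_DEVIATION_FACTOR = 0.6745  # probable deviation = .6745 x standard deviation

def percent_mean_to_pe_distance(pe_units: float) -> float:
    """Per cent of total area between the mean ordinate and an ordinate pe_units P.E. away."""
    z = pe_units * PROBABLE_DEVIATION_FACTOR       # convert P.E. units to sigma units
    return 50.0 * math.erf(z / math.sqrt(2.0))     # one-sided area, in per cent

for k in (1, 2, 3, 4):
    print(f"{k} P.E.: {percent_mean_to_pe_distance(k):.2f}%")
```

One probable deviation on each side of the mean therefore encloses the middle half of the cases, which is the ‘even chance’ described above.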


Fitting a Normal Curve to a Series of Measures given in the form of a 
Frequency Polygon 

It is better to draw the histogram or frequency polygon on 
graph paper to a suitable scale so that the paper is comfortably 
filled. The S.D. of the measures should be calculated after they 
have been grouped into frequencies. 

(1) The height of the normal curve (see Appendix III) may be 
calculated from 

$$y_0 = \frac{N}{\sigma\sqrt{2\pi}}$$

where N is the number of measures and σ is the standard deviation. 

(2) The mid-point of each interval should be calculated 
in terms of sigma units by dividing each x value by the standard 
deviation. 

(3) By using Table II the heights of the ordinates at each of 
these points are calculated. The table gives these values as a proportion 
of the mean ordinate and the actual heights are found by 
multiplying the height of the normal curve (mean ordinate) by 
the figure found in the table. The curve may then be plotted 
by joining the tops of the vertical ordinates with a smooth curve. 
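
Steps (1) to (3) can be sketched as follows. The frequency distribution is hypothetical, and the sketch makes one assumption that should be stated plainly: the mean ordinate is multiplied by the class width so that the fitted heights are frequencies per class interval (equivalent to expressing σ in class-interval units in the formula above).

```python
import math

def fit_normal_curve(midpoints, frequencies):
    """Return mean, S.D., mean ordinate and fitted frequencies for grouped data."""
    n = sum(frequencies)
    mean = sum(m * f for m, f in zip(midpoints, frequencies)) / n
    variance = sum(f * (m - mean) ** 2 for m, f in zip(midpoints, frequencies)) / n
    sd = math.sqrt(variance)
    width = midpoints[1] - midpoints[0]          # equal class intervals assumed
    # Step (1): height of the curve at the mean, scaled to frequency per class.
    y0 = n * width / (sd * math.sqrt(2.0 * math.pi))
    # Steps (2) and (3): sigma-distance of each mid-point, then its ordinate.
    fitted = [y0 * math.exp(-0.5 * ((m - mean) / sd) ** 2) for m in midpoints]
    return mean, sd, y0, fitted

# Hypothetical grouped marks: class mid-points and their frequencies.
midpoints = [30, 40, 50, 60, 70, 80, 90]
frequencies = [2, 8, 20, 30, 20, 8, 2]

mean, sd, y0, fitted = fit_normal_curve(midpoints, frequencies)
for m, f, y in zip(midpoints, frequencies, fitted):
    print(f"{m:3d}  observed {f:2d}  fitted {y:5.1f}")
```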

Inevitably there will be discrepancies between the actual 
ordinates and those obtained from the perfect curve. The sum 
of the theoretical frequencies of the curve should always be slightly 
less than those of the given distribution. The probability that a 
given distribution has discrepancies (which make it differ from 
a theoretical distribution) which are not due to chance can be 
found by using Chi-squared and consulting the appropriate tables. 


The curve has some other uses in educational statistics. It can 
be used for setting standards for the distribution of marks, to 
assign values of difficulty to questions in a test, to give numbers 
of pupils in equal ability or talent ranges, for making scales for 
measuring various factors in addition to those of a purely cognitive 
type. It is often convenient to consider the curve as extending 
from −3σ to +3σ or even from −2.5σ to +2.5σ only. The 
student will bear in mind the nature of the small errors so 
introduced. 


CHAPTER VI 


MARKING AND ITS PROBLEMS 


It is both amusing and disturbing to think that in many schools 
and colleges, lists of marks which have been produced in an 
arbitrary and entirely unscientific manner are thought to have 
an absolute value which bears no relation to the means by which 
they are obtained. For weal or woe no small part of the work of 
many teachers is the production of mark lists and the compounding 
of marks. It is well to give a little thought to the foundations 
of our beliefs concerning marks, particularly when these have 
been regarded as sacrosanct and as a type of numerical label by 
which one individual differs from another. A moment's thought 
will serve to show the limitations of certain marking systems. It 
would be a bold man who in marking two essays would give 
thirteen marks out of twenty to one and fourteen to another and 
be certain that the second was 5% better than the first! It would 
be a still bolder man who insisted that he was sure, in an English 
examination of the old type, that a candidate with 96 marks out 
of 100 was 1% better than another with 95 marks. 

We can begin by summarizing the chief uses of marking 
systems: 


1. To obtain an order of merit list 


This is the popular use of marks in the schools. In order that 
there shall be a good spread it is necessary to devise a test which 
will give a normal distribution of the marks, or something 
approaching it. If two pupils have the same mark they will 
occupy the same place and the next pupil in order of merit will 
have the next but one place. If the mark list in order of merit is 
to be used for correlation purposes either by Spearman’s method 
of ranks or by the ‘footrule’ it is wise to consider more carefully 
these ‘tied’ places. 
e.g. The following is a portion of an old school mark list. 

            Mark    Position in rank 
Thompson     92     1 
Allen        84     2 
Walker       81     3= 
Smith        81     3= 
Brown        81     3= 
Jones        79     6 
Turner       76     7 


In this case it is better to credit Walker, Smith and Brown with 
the average place, i.e. the fourth place. In the same way, suppose 
two boys ‘tie’ in the mark which comes after the 10th place. 
Instead of putting two 11th places between the 10th and the 13th 
places it is wise to credit the two boys with equal marks with 
11½ places each. If correlation is to be performed this is 
particularly important. 
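
The treatment of tied places can be sketched in Python as below, using the mark list quoted above. The function name `average_ranks` is an illustrative assumption; it credits each pupil with the average of the places which the tied pupils would otherwise occupy.

```python
def average_ranks(marks):
    """Return the rank of each mark (1 = highest), giving tied marks their average place."""
    ordered = sorted(marks, reverse=True)
    rank_of = {}
    for mark in set(marks):
        first_place = ordered.index(mark) + 1    # first place occupied by this mark
        tied = ordered.count(mark)               # number of pupils sharing the mark
        rank_of[mark] = first_place + (tied - 1) / 2
    return [rank_of[mark] for mark in marks]

names = ["Thompson", "Allen", "Walker", "Smith", "Brown", "Jones", "Turner"]
marks = [92, 84, 81, 81, 81, 79, 76]

for name, mark, rank in zip(names, marks, average_ranks(marks)):
    print(f"{name:9s} {mark:3d}  {rank}")
```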


2. To separate candidates who reach a certain level from those who do not 


Most of the public examinations, such as those for school 
certificates, matriculation, degrees and diplomas, have this end in 
view. At first sight this may seem easy, but it is beset with pitfalls. 
It is unwise to draw our lines of demarcation on the frequency 
curves at points where the curve is at its highest, for here there is 
less chance of a critical separation of one class of candidates from 
another. The standards of examination papers and of students 
taking the examination vary from year to year. It is difficult 
or impossible for an examiner who has set an examination paper 
to know what standard it is by just looking at it. Only experiment 
with many trials will show, and this is not usually possible. 
Examiners are changed from year to year or after a short period 
of years. Many examining bodies ‘standardize’ the marks, by 
approximating the percentages of credits, passes, failures and even 
distinctions respectively from year to year. It follows that in a 
year when many good candidates present themselves it is much 
more difficult to pass the examination than when there are more 
weaker candidates. 
3. Tests and examinations may be set by a teacher to test the value of his 
own work or to estimate the progress already made by a class 


This should help the teacher to find what is difficult and what 
is easy to the pupils in his own teaching, and he can amend his 
work accordingly. 


4. Examinations should also look forward 


and not only backward on the pupil's past work. In other words, 
examinations should be prognostic. How far they have this 
quality has been the subject of considerable investigation. If the 
boy or girl at eleven has reached a certain standard in Arithmetic 
and English is he or she a fit candidate for a place in a grammar 
school? Entry to the old universities may be secured with scholarships 
if a candidate shows sufficient knowledge of and ability in 
Mathematics. Is this a sufficient guarantee of a satisfactory 
university and subsequent career?¹ 

Examinations are not as reliable as they ought to be for some 
or all of the following reasons: 

(1) The number of questions of the older or essay type which 
the candidate is able to answer in the allotted time is so small 
that there is insufficient sampling of the candidate's knowledge. 
Questions of ‘luck’ or ‘chance’ figure too largely in the result, 
from the candidate's point of view. 

(2) Candidates may differ in mental and physical condition 
from day to day and this will affect performance in the examina- 
tion. Vitamin intake, digestion, hours of sleep, mild infection, 
other physical and emotional states, the time of day, atmospheric 
and other environmental conditions and the total length of the 
examination may modify the student's work in it, or in some 
part of it. 

(3) Particularly in the ‘Arts’ subjects there may arise differences 
of opinion between one examiner and another concerning 
the value of a student's work. 

(4) Examiners are not always consistent with one another in 
their standards of marking. Nor will the same examiner adhere 
to the same standard at different times of the same day, at 
different parts of the week and at different stages in marking a 
large batch of examination papers. 

1 An excellent short examination of examinations is given in Chapter XI of 
P. E. Vernon's The Measurement of Abilities. 

The compounding of marks is a still more difficult task. Here 
the idiosyncrasies of a number of markers in different subjects 
will produce anomalies in the final result which are both unfair 
and misleading. As so much is often made to depend on the sum 
total of a candidate's achievement in an ‘omnibus’ examination, 
it is the duty of all concerned in the matter to investigate carefully 
what really lies behind the masses of figures which are produced 
from the several subjects of the examination. 

In a public examination, such as the Intermediate Examina- 
tions of the University of London, it may be possible to give equal 
weight to each of the subjects which are taken; but in a school 
annual examination this is not possible, nor is it for the marks 
which are given on each term's work. It is obvious that the 
maximum marks in English should be greater than those for 
Geography, just as those in Mathematics will usually be greater 
than those in Chemistry. The reason is the obvious one that more 
hours per week are devoted to English than to Geography, to 
Mathematics than to Chemistry. (We will leave the problem of 
relative importance from other points of view, though few would 
contest the superior position of English in the school curriculum.) 

A reasonable way of treating the marks of the respective 
subjects before compounding them would be to arrange each 
maximum mark so that it is proportional to the time devoted to 
the particular subject each week.¹ 

1 It must be admitted that practical subjects on which much 
time is spent carry few or no marks. The whole matter is a difficult one and it is 
impossible to arrive at a solution which will satisfy everybody. 

Suppose 5 hours are spent on English, 4 hours on Mathematics, 
3 hours on Science and 2 hours on History. We might allow a 
term's maximum of 200 marks for English, 160 for Mathematics, 
120 for Science and 80 for History. It may happen that the total 
for all subjects will come to some large number which is not a 
multiple of a hundred. Whatever the total maximum, that is, 
the total of the maxima of all the subjects, an order of merit can 
be found just as easily, and if a percentage is required of the 
maximum score this can subsequently be found by simple reduction. 
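
The arrangement of maxima in proportion to teaching time can be sketched as follows. The rate of 40 marks per weekly hour is implied by the figures in the example (200 marks for 5 hours); the names used are illustrative.

```python
# Hours per week devoted to each subject, as in the example above.
hours_per_week = {"English": 5, "Mathematics": 4, "Science": 3, "History": 2}

MARKS_PER_WEEKLY_HOUR = 40  # 200 marks for 5 hours of English

# Maximum mark for each subject, proportional to the time spent on it.
maxima = {subject: hours * MARKS_PER_WEEKLY_HOUR
          for subject, hours in hours_per_week.items()}
total_maximum = sum(maxima.values())

def percentage_of_maximum(total_marks: float) -> float:
    """Reduce a pupil's compounded total to a percentage of the total maximum."""
    return 100.0 * total_marks / total_maximum

print(maxima)
print("total maximum:", total_maximum)
```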

There is usually a more serious difficulty in the compounding 
of marks. Some markers feel that a normal distribution of marks 
tends to depress and discourage all but the top quartile division 
of the candidates, whilst others feel that they may force their 
students to strive for better ultimate examination results by 
marking stiffly the work and tests of the term. Again, others find 
marking so difficult that they are only able to separate from the 
mass of papers the very poor candidates and the very good ones, 
and all others are bunched together with very little spread or 
dispersion of marking and a rather high average usually of about 
55%. This makes the compounding of marks difficult. We can 
do something to adjust the various marking scales which will 
improve matters somewhat. Each mark may be regarded as a 
positive or negative deviation from the mean which is called 0, 
or the marks may be standardized by dividing these deviations 
by the standard deviation. All this would involve much labour 
which would certainly not be welcome and might not be possible 
at the end of term. The marks might be improved for the pur- 
poses of compounding by adjusting the marks in the interquartile 
range by means of a graph. 

Another useful expedient is to adjust the marks by means of 
a straight-line graph so that the top boy gets the maximum marks 
and the bottom boy no marks. (The objection to this is that the 
top boy may not be worthy of the maximum marks just as the 
bottom boy will probably deserve something better than zero 
marks.) All the objections in theory are met, however, by the 
very practical result that the resulting order of merit is much 
fairer to all concerned. We have said enough to show that no 
system of marks is entirely above criticism, and if we keep in 
mind the difficulties of marking and compounding our marks our 
system will progressively improve. 
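
Two of the adjustments mentioned above, standardizing by the standard deviation and the straight-line adjustment which gives the top boy the maximum and the bottom boy zero, can be sketched in Python. The marks are hypothetical and the function names are illustrative; the population standard deviation (`pstdev`) is assumed.

```python
from statistics import mean, pstdev

def standardize(marks):
    """Express each mark as a deviation from the mean in units of the standard deviation."""
    m, s = mean(marks), pstdev(marks)
    return [(x - m) / s for x in marks]

def stretch_to_full_range(marks, maximum):
    """Straight-line adjustment: bottom mark becomes 0, top mark becomes the maximum."""
    low, high = min(marks), max(marks)
    return [maximum * (x - low) / (high - low) for x in marks]

# Hypothetical bunched marks from a marker who uses little of the scale.
bunched = [48, 52, 55, 57, 60, 63]

print([round(z, 2) for z in standardize(bunched)])
print([round(x, 1) for x in stretch_to_full_range(bunched, 100)])
```

Both adjustments leave the order of merit within the subject unchanged; they alter only the spread which that subject contributes to the compounded total.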

Most teachers soon evolve a personal system of marking, and 
it is well for all who have to mark the work of pupils and students 
to explore the fundamentals of their own ideas on the subject. 
It is more difficult to mark papers of the essay type than those of 
the new style where there are many shorter questions, which 
usually only require a sentence in answer to each, or even the 
choosing of a correct word or sentence from a number which are 
given for each question. It is obviously more difficult to mark 
an English essay where style is taken into consideration than an 
Arithmetic paper where a marking scheme can be followed fairly 
closely. Where marks are deducted for errors, markers should see 
that the total reduction bears a reasonable relationship to the 
marks credited for correct work. Until much practice has been 
obtained in marking and the marker has subjected his work to 
careful examination, it will be inevitable that a careful re-marking 
of a batch of papers, after a first assessment, will be desirable. 
This will enable the earlier papers in a batch to be adjusted to 
those which have come later and have been marked in ‘a state of 
maturity’ for that particular examination. Some conscientious 
examiners arrange the papers in order of merit as shown by their 
marking and then re-read them in descending order of merit, 
satisfying themselves that each paper is a little less worthy than 
the one which preceded it. If the examination and the candidates 
have been fairly matched the marks should be distributed in a 
normal manner or in an approximation to it. In the case of fairly 
homogeneous small groups (e.g. the mathematical ‘sets’ of a large 
fifth form) it is difficult to obtain the requisite distribution of the 
marking. It is obvious that the larger and more heterogeneous is 
the group the easier will it be to obtain normal distribution. It 
may be allowable in a scholarship examination when only a very 
few of the finest candidates can obtain awards to permit a slight 
positive skew to the distribution and thus give a better spread in 
the upper reaches of the marking. In the same way it may be 
permissible to allow a little negative skewing if the intention of the 
examination is merely to reject a few candidates who fail to secure 
a minimum of marks less than 40% or 50%, but the fact remains 
that for general purposes normal distribution should be aimed at 
and the marks which separate one class or degree of merit from 
another should not coincide with the mode (which in the case of 
normal distribution would also equal the mean and the median). 

A simple problem in connection with marking is the reduction 
of marks. The marks have been given to one maximum mark 
and it is desired to reduce or translate them to another scale with 
a different maximum. It is presumed that it is not desired to 
interfere with, or endeavour to modify in any way, the relative 
distribution of the marks which would be best achieved by 
drawing a curved-line graph. 

The simple task of ‘reducing marks’ is best effected by one of 
three ways: 

(1) Using a slide-rule. 
(2) Drawing a straight-line graph. 
(3) Multiplication of the marks by an easy fraction. 

1. Using a slide-rule.¹ This simple instrument permits multiplication 
and division sums to be performed by adding or subtracting 
lengths of a ruler. As the standard engineer's slide-rule 
permits the use of various functions and is a more complicated 
instrument than we require for the simple reduction of marks, 
some schools possess a large slide-rule which is graduated for 
multiplication and division only. Suppose we have marked to a 
maximum of 120 marks and we wish to reduce these marks to 
a maximum of 100, that is, to express them as a percentage of the 
maximum. We take the slide-rule and move the lower scale (B) 
so that the graduation 12 on it corresponds with 10 on the upper 
scale (A). The given mark is found on scale B and the reduced 
mark is read opposite to this on scale A. 

2. A ‘ready-reckoner’ table can be made in convenient form by 
drawing a straight-line graph. It is best to use graph paper where 
each large division contains ten (and not five) small divisions for 
this will facilitate reading the graph. To take the case given 
above. A point on the graph paper, on which axes have been 
drawn horizontally at the bottom of the paper and vertically on 
the left side, is found which corresponds to the maxima in the 
given and on the reduced scale. This will be the point with x value 
120 and y value 100. The point (12, 10) (counting in large squares) 
is found and joined to the point O (the point of intersection of the 
axes) and the resulting straight line is the graph required. It is 
only necessary to find the corresponding y value on it when an x 
value (that is, a mark on the 120 maximum scale) is read off 
horizontally. 

1 See Appendix II. 


3. Many simple reductions can be performed by rapid mental 
arithmetic. Reductions to a half or a tenth or by two-thirds, a 
fifth and so on would give no trouble. A reduction which frequently 
occurs is from 25 to 10 as maxima. This is equivalent to 
dividing by 5/2, which is equivalent to multiplying by 4/10. Thus we 
multiply each mark on the 25 scale by 4 and divide it by 10 by 
shifting the decimal point one place to the left. The reduction 
from a maximum of 120 to one of 100 is equivalent to multiplying 
by the fraction 5/6 or 10/12. 

Most people could achieve this very quickly by adding a nought 
to each mark on the 120 scale to multiply it by 10 and then 
dividing each number by 12. Some conscientious teachers who 
find difficulty in handling figures obtain their reductions by one 
method and check them with another.¹ 
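
The reduction of marks from one maximum to another is a single multiplication by a fraction, as the following sketch shows. Python's standard `fractions` module is assumed so that the fraction is kept exact; the function name `reduce_mark` and the sample marks are illustrative.

```python
from fractions import Fraction

def reduce_mark(mark: int, old_maximum: int, new_maximum: int) -> Fraction:
    """Translate a mark from one maximum to another without altering the distribution."""
    return mark * Fraction(new_maximum, old_maximum)

# From a maximum of 120 to one of 100: multiply by 100/120, that is 5/6.
print(Fraction(100, 120))             # the reducing fraction in lowest terms
print(reduce_mark(90, 120, 100))      # a mark of 90 out of 120

# From a maximum of 25 to one of 10: multiply by 4 and divide by 10.
print(reduce_mark(20, 25, 10))        # a mark of 20 out of 25
```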

1 Many markers find that they obtain a better spread when they mark from 0 to 10 
(i.e. an 11-point scale). There are psychological reasons for this as they feel more 
sure of themselves with fewer choices of marks. In marking advanced work a 
5-point scale A, B, C, D, E is often used and this is extended in a non-mathematical 
manner to include + and − marks to the letters. C should be an average mark 
and A should occur quite rarely. In practice E is rarely or never given and this 
automatically makes a 4-point scale or skews the distribution negatively. 

The importance of the transfer examination which is now taken 
by all children in state-controlled schools at the end of their 
primary school life has become greater, not less, since the passing 
of the Education Act of 1944. In view of the fact that the whole 
subsequent life and career of a child may be modified by the type 
of secondary education which he receives, it is hardly necessary to 
say that anything which can be done to improve the transfer 
examination, which is taken at about the age of eleven, should be 
regarded as a matter of prime importance. We should look upon 
the test as one which should have a prognostic value. Although 
statistical analysis in these matters is probably of less importance 
than the sound framing of the test papers, it is only by mathematical 
investigation that we can be assured that we are on the right 
lines in our examination methods. Much yet remains to be done, 
but all honour should be given to Professor Godfrey Thomson, 
who has devoted many years of his life to these problems and 
with his staff has evolved the Moray House Tests. It is obvious 
that the standard of the tests should be maintained from year to 
year and that the tests should aim at determining the type of 
secondary education which will best fit a particular child rather 
than testing the attainment and factual content of the child. 
Accordingly, tests in English, Arithmetic, and ‘Intelligence’ 
which seek to explore the native capacity (often called intelli- 
gence) of the child, are prepared with this end in view and are 
standardized by exhaustive experimental tests. It is easy to 
imagine that ideal tests for children of 10-11 cannot be evolved 
by an ‘armchair’ process, but only painstaking trial and error 
and careful analysis of the results will suffice. Even so, no ideal 
tests have yet been found, and there is still at least 10% error in 
the prognostic value of most transfer tests. Nor is the underlying 
psychological theory a matter on which there is complete agreement 
between eminent authorities. It is believed that the average 
verbal ability of girls at the transfer age is somewhat greater than 
that of boys. Dr. W. P. Alexander has stressed repeatedly and with 
justification the necessity of allowing for non-verbal abilities in 
transfer examinations, and he would divide abilities by means of 
oblique factors¹ into verbal and non-verbal types. Enough has 
been said to show that the serious student interested in the transfer 
examination will find much data which can be explored by 
statistical methods and will yield useful results. These must still 
be regarded as being valuable even when they only serve to show 
us the weaknesses of our methods and do not always offer any 
ideas for their improvement. 

In connection with transfer examinations and attainment tests 
an important matter susceptible to statistical treatment is the age 
allowance in marking schemes. 

1 See page 182. 

Some education authorities permit only a single attempt at a 
transfer examination, and there is thus an age range of a year. 
Allowance is made for differences of not less than a month. Other 
authorities have an age range of two years or even more and 
permit two attempts at the examination if necessary. In fixing 
this allowance it is wise to make experiments with large numbers 
of children of various age groups and to use general papers 
containing tests of ‘intelligence’, English and Arithmetic rather 
than papers of more limited scope. We could set a series of papers 
to children in age groups of 12, 11 and 10 respectively, and find 
the median score for each paper (or set of papers) and for each 
group. The median score or norm would show an increase when 
using the same paper from year to year. By drawing a graph for 
each paper (or set of papers), using the three points of the 12, 11 
and 10 norms, we find that we can find a straight line which 
practically goes through the three points in each case. If we use 
the graph to call the 12-year norm 100, we can read off the 
11-year and 10-year norms on this scale. The graphs obtained 
from the median scores of the other sets of papers will have 
different slopes, but when the 12-year median score is called 100 
and the other norms multiplied by the same fraction or read off 
on the graph we shall probably find that the other norms differ 
a little for the same age group. The average is then taken. 

Suppose that the difference averages about 24% per year. At
first sight it may appear that 2% should be added to the marks
of a candidate for every month of his age below 12 years. This
would probably be unfair, as 2% of a lower mark is obviously less
than 2% of a higher. To overcome this several methods are
employed. We can take the age of the pupil with the greatest
number of marks and, reckoning two marks per month as an age
allowance, scale up his marks to those which would be expected
if he were 12, by means of a graph or a slide-rule. The
corrections work out as follows:


Age      Per cent      Age      Per cent
12-0       100         11-5        86
11-11       98         11-4        84
11-10       96         11-3        82
11-9        94         11-2        80
11-8        92         11-1        78
11-7        90         11-0        76
11-6        88         etc.       etc.

Thus we should find the percentage corresponding to the age of
the pupil and multiply his marks by a fraction with this percentage
in the denominator and 100 in the numerator. E.g. suppose a boy
of 11 years 4 months obtains a total of 362 marks. His expected
achievement at the age of 12 years precisely would give

$$362 \times \frac{100}{84}$$

or 431 marks.
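
The scaling just described can be sketched in a few lines. This is only an illustration of the arithmetic, assuming the allowance of 2% per month used in the table; the function names are our own and are not part of any published scheme.

```python
def allowance_percent(years: int, months: int, per_month: float = 2.0) -> float:
    """Percentage of the 12-year norm expected at the given age (12-0 gives 100)."""
    months_below_12 = 12 * 12 - (12 * years + months)
    return 100.0 - per_month * months_below_12


def scaled_marks(marks: float, years: int, months: int) -> float:
    """Scale raw marks up to the level expected at exactly 12 years."""
    return marks * 100.0 / allowance_percent(years, months)


# The boy of 11 years 4 months with 362 marks, as in the worked example.
print(allowance_percent(11, 4))          # percentage read from the table
print(round(scaled_marks(362, 11, 4)))   # marks expected at 12 years
```

Because the marks are multiplied rather than increased by a fixed number, a higher raw mark receives a proportionately larger allowance.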

The matter may be regarded from another angle: we have 
obtained norms for each age group and by interpolation we can 
obtain norms for each month. Every pupil's marks will corre- 
spond with a particular age norm and therefore we could give 
an assessment of the achievement of each pupil in terms of his 
test or examination age, that is, the number of months above or 
below average as an equivalent of a greater or lesser ability than 
the normal for his age. 


[Fig. 17. Percentile curves for four three-month groups. XY represents an age
allowance for 9 months at a particular percentile level. This level must then be
interpreted in terms of the scores of the whole of the candidates from a separate
curve. For convenience percentiles have been reckoned from the highest score.
Horizontal axis: scores in test or tests, marked from 30 to 80.]

In transfer examinations some authorities, following a method
similar to that which we have outlined above, have a table of 
percentages of marks which are added to the total scores of the 
children according to their ages. A cruder method which is 
employed by others is to have a table of marks and add an 
appropriate number to those of a child in regard to his age but 
without regard to his achievement. Strictly speaking, the percen- 
tage or proportional way of making the increase is the only 
equitable way, for the method of adding fixed numbers of marks 
according to age benefits the weaker children at the expense of 
the more able. 

The best method for ordinary use, and one which does not
involve a great deal of labour, is that due to Thomson.¹ The total
marks (or those in separate subjects) for every child are divided
into four age groups: 11-0 years to 11-2 years inclusive, 11-3 to
11-5 years, 11-6 to 11-8 years, and 11-9 to 11-11 years. Cumulative
frequency (percentile) curves are drawn for the marks in each
group. The abscissae differences between the first and the fourth
curves give the differences in marks corresponding to a 9 months
age difference. It will be noted that this difference is one of
9 months and not of 12 months, as each curve is for the average
age of the three-month age group; that is, the first curve is centred
on an age of 11 years 1½ months and the last on 11 years
10½ months. It is now necessary to interpret these in terms of the
percentiles and marks of the whole 11-year group taken together.
Usually no child under 11 is given more than the allowance
for 11-0 years. The mark difference for 9 months is divided by 9
to give the monthly adjustment for each score level. Equivalent
marks are subtracted for children from 12-0 years to 12-11 years.
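
The central step of this method, reading off the mark difference between the youngest and oldest groups at a given percentile level and dividing by 9, may be sketched as follows. The marks are hypothetical, the percentile is found by simple linear interpolation rather than from a drawn curve, and percentiles are reckoned here from the lowest score; these are all assumptions made for illustration.

```python
def percentile(scores, p):
    """Score at percentile p (0-100), by linear interpolation between ranks."""
    ordered = sorted(scores)
    position = (len(ordered) - 1) * p / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def monthly_allowance(youngest, oldest, p, months_apart=9):
    """Marks per month of age at percentile level p."""
    return (percentile(oldest, p) - percentile(youngest, p)) / months_apart


# Hypothetical marks for the 11-0 to 11-2 group and the 11-9 to 11-11 group.
youngest = [31, 38, 42, 47, 50, 54, 59, 63, 70]
oldest = [40, 47, 51, 56, 59, 63, 68, 72, 79]

for level in (25, 50, 75):
    print(level, monthly_allowance(youngest, oldest, level))
```

In real data the allowance will generally differ from one percentile level to another, which is the reason for working level by level instead of adding one fixed number of marks.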

There remains the question of the ideal mark scale and the
mark value of each question in a given test. These matters can
best be understood by further reference to our curve of normal
distribution. It will be seen that if we draw vertical lines at
distances of 3σ on each side of the central point the area enclosed
by these lines and the curve is practically the whole of its area.
Now, the area of the curve gives the frequency or the number of


1 See The British Journal of Educational Psychology, 1936:

cases or scores, and only about 0.27% of the scores lie beyond the 3σ lines
at the left and right extremes of the curve. (This will be clear
from our short chapter on the normal curve.) If instead of
drawing our vertical lines at points 3σ from the centre we choose
points at a distance 2½σ on each side of this point, the area of the
curve thus enclosed is 98.76% of the whole, that is to say, we have
omitted only 1.24% of the whole scores. Although we have made
slight sacrifices to accuracy it is very convenient to have a base
of 5σ instead of 6σ because we can more readily divide it into a
ten- or a hundred-part scale, and for our purpose here this
arrangement is quite accurate enough.
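
The areas quoted from the tables can be checked directly. The sketch below uses the error function from Python's standard library in place of printed probability-integral tables; the function name is our own.

```python
import math


def area_within(k: float) -> float:
    """Percentage of a normal curve lying within k standard deviations of the mean."""
    return 100.0 * math.erf(k / math.sqrt(2.0))


for k in (2.5, 3.0):
    inside = area_within(k)
    print(f"within {k} sigma: {inside:.2f}%   beyond: {100 - inside:.2f}%")
```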
[Fig. 18.]


Suppose now that we divide it into 10 equal divisions along its
base, and further let us imagine that in a test we have this number
of properly graded questions, so that on drawing a graph showing
the number of persons solving each question we get a distribution
curve of the normal type.

The scale of ability is taken to be similar to that of the scale of
difficulty of the questions. Now area ‘a’ is equivalent to the
number of those who cannot solve Question 1. Similarly area
‘ab’ represents the number of those who cannot solve Question 2,
‘abc’ those who cannot solve Question 3, and so on. Obviously
the mark value of a question should increase with the proportion
of people who fail to solve it. For instance, by consulting the
tables giving the proportions of curves of normal distribution
which are cut off by ordinates at particular distances from the
central point, we can find that the area abcdefg is approximately
85% of the area of the whole curve. Hence Question 7 would be
too hard for 85% of the candidates but it could be solved by the
remaining 15% (assuming that the time factor did not enter).

Thus if a question is solved by 15% of the candidates it will be
of difficulty 7 and take this number of marks.

We can take the matter a step forward by drawing a percentile 
curve showing the percentages of candidates failing to solve each 
problem according to its difficulty and the marks which will be 
given to it. 

The student will find the construction of such a curve and the 
following tables an easy exercise in the use of the normal distribu- 
tion or probability-integral tables: 


Marks per      % able to      % failing to
question       solve it       solve it
    1            98.35           1.65
    2            94              6
    3            85             15
    4            70             30
    5            50             50
    6            30             70
    7            15             85
    8             6             94
    9             2             98
   10          almost 0      almost 100
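
The student who wishes to check such a table without printed tables might proceed as below. It is assumed here, as our own reading of the construction, that the base of 5σ runs from 2½σ below the centre to 2½σ above it and is cut into ten equal divisions, so that the area to the left of the ordinate closing division 7 stands for those failing Question 7. The figures so obtained agree only approximately with the rounded table.

```python
import math


def percent_failing(question: int, divisions: int = 10, base_sigmas: float = 5.0) -> float:
    """Area of the normal curve to the left of the ordinate closing the given division."""
    step = base_sigmas / divisions               # width of one division in sigma units
    z = -base_sigmas / 2.0 + question * step     # ordinate at the right of that division
    return 50.0 * (1.0 + math.erf(z / math.sqrt(2.0)))


for q in range(1, 11):
    failing = percent_failing(q)
    print(q, round(100 - failing, 1), round(failing, 1))
```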


In order not to break too much with time-honoured custom 
and yet maintain a system which permits a mathematically 
reliable compounding of marks, some authorities regard 90% as
the highest mark and 30% as the lowest in all but exceptional 
cases. Only one candidate in several hundred or even a thousand 
is regarded as being so excellent that he achieves more than 90% 


1 See page 91.
or so feeble that he scores less than 30%. This method, being used
by schoolmasters and in certain of the public university examina- 
tions, obviously implies a certain degree of homogeneity resulting 
from the selection of the more able individuals from the population 
at large. 

A reasonable dispersion would be given by a standard deviation 
of 10 and, assuming a normal distribution, a median of 60. In this 
case the percentages of candidates expected to achieve scores in 
various mark groups would be as follows: (The extreme upper and 
lower reaches of the marking are reserved for candidates of rare 
brilliance or poverty of achievement.) 


Mark %        % in each group
92-88           up to ½
87-83             1
82-78             3
77-73             6½
72-68            12
67-63            17
62-58            20
57-53            17
52-48            12
47-43             6½
42-38             3
37-33             1
32-28           up to ½
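
The percentages in this table follow from the normal curve with a median of 60 and a standard deviation of 10. A minimal sketch is given below; it assumes that whole-number marks are grouped in fives and that each group extends half a mark beyond its end marks, which is our own convention for the illustration.

```python
import math


def cdf(x: float, mean: float = 60.0, sd: float = 10.0) -> float:
    """Proportion of a normal distribution lying below x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def percent_in_band(low: int, high: int) -> float:
    """Expected percentage of candidates with whole-number marks from low to high."""
    return 100.0 * (cdf(high + 0.5) - cdf(low - 0.5))


for high in range(92, 27, -5):
    low = high - 4
    print(f"{high}-{low}: {percent_in_band(low, high):.1f}")
```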


In practice, things do not work out quite as easily as this. Marks 
have to be allowed in many cases for answers which are partly 
correct and in many tests a choice of questions has to be per- 
mitted. In the ‘new-type’ examinations the number of questions 
would be much larger than in the old type and answers would be 
right or wrong, for the most part. Also, in view of the larger number 
of questions, proper sampling of the candidates can be achieved 
and there is no need to permit selection on the part of candidates. 
Nevertheless, in any type of examination a proper order of merit
will only be secured by a proper grading of questions in difficulty,
with a weighting of marks in accordance with the requirements of
the curve of normal distribution. It is not pretended that practical 
achievement in examining can match up to theoretical ideal 
demands but a more careful mathematical analysis of each test 
will go far to improve a system of examinations which has not yet 
been replaced as a means of assessing ability and achievement. 

In a work well known to the point of notoriety Hartog and
Rhodes produced evidence to show the unreliability of examinations.
No doubt An Examination of Examinations was intended to
make our flesh creep, and to sustain their thesis the authors chose
cases which did all they could to show the subjectivism of marking 
in the worst possible light. Most of the sets of scripts which were 
used for their experiments were more homogeneous than we should 
ordinarily find. Such sets of papers always present difficulties and 
it is well known that to secure a distribution which approaches a 
normal one we must use a large and heterogeneous group. Never- 
theless, the work of these authors did much to bring a realization 
of the need for more care in examinations no matter at what level. 

On the other hand, the value of examinations and the care and 
thought with which they are conducted has been finely expressed 
by Brereton in The Case for Examinations. It is a step forward 
if only average marks and standard deviations or interquartile 
ranges are equalized between one examiner and another or 
between one subject and another before marks are compounded.
There is an increasing awareness of the necessity of this, and that a
failure to do so will lead to erroneous and anomalous results in
final order of merit lists.

It must not be assumed that the new type of test is in all ways 
superior to the old, or that it is free from defect. Vernon in The 
Measurement of Abilities has given an excellent analysis of this 
matter. Much more time, skill and experience are necessary for 
the production of the new type test-paper containing many 
graded questions, but time is saved in marking the scripts. Unless 
the number of scripts exceeds 300 no time is saved on the aggre- 
gate of setting the papers and marking the scripts. The examiner 
must decide just which type of question suits his purpose for the 
subject matter in hand. The questions may be divided into the
following types: (a) Simple recall and ‘open-completion’, where
blank spaces in the question have to be filled in. (b) True-false 
where there is a set of statements some of which are true and some 
false. The candidate has to indicate ‘which is which’. (c) The 
Multiple-choice type, including best reason and matching items. 
In each case a number of alternative answers are given. One is 
correct and this is to be underlined by the candidate. (d) Re- 
arrangement type. Here a list of items which should fall into a 
unique order is given in the wrong order. The candidate must 
rearrange them to give the correct order. 

In the new-type tests a certain number of correct answers in the 
recognition-type of test may be obtained by chance guessing. This 
only means that the zero level in scoring is equivalent to a score 
which could be calculated as being the percentage of marks which 
might have been obtained by pure chance. The marks obtained 
may be corrected for guessing by using the formula

$$\text{True score} = R - \frac{W}{n - 1}$$

where R is the total number right, W the total number
wrong and n the number of alternative answers provided for each
question. It has been shown that the above correction only makes
appropriate compensations for the average candidate. On the
whole the effect of guessing is much less than the layman would
imagine.
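
The correction for guessing is easily computed. The sketch below simply evaluates R − W/(n − 1); the candidate's figures are hypothetical and the function name is our own.

```python
def corrected_score(right: int, wrong: int, alternatives: int) -> float:
    """Score corrected for guessing: R - W / (n - 1)."""
    if alternatives < 2:
        raise ValueError("each question needs at least two alternative answers")
    return right - wrong / (alternatives - 1)


# Hypothetical candidate: 60 right and 20 wrong on four-choice items.
print(corrected_score(60, 20, 4))
# With true-false items (n = 2) the formula reduces to R - W.
print(corrected_score(60, 20, 2))
```

Questions left unanswered count neither as right nor as wrong, so that a candidate who does not guess loses nothing by the correction.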


Mental Ages and Intelligence Quotients 

The Mental Age (M.A.) of a child as given by an intelligence
test, or its Educational Age (E.A.) as given by educational tests, is
equal to the actual or Chronological Age (C.A.) of an average
child with the same test scores. Intelligence Quotient is given by

$$\text{I.Q.} = \frac{\text{Mental Age}}{\text{Chronological Age}}$$

and is often expressed as a percentage. At first
sight these may seem to be a much simpler and more straightforward
method of describing attainments or abilities than the use
of percentile levels. There are some difficulties, however. To
start with, the growth of intelligence and educational abilities is
not regular year by year. The upper limits of achievement vary
from child to child. After the age of eleven the intelligence-test
[...] achievement or intelligence tends to increase with increasing age.
The fractions M.A./C.A. (i.e. I.Q.) and E.A./C.A. (i.e. E.Q.) keep reasonably
constant for a number of years.

1 The system of marking at most musical festivals and competitions seems to be
extraordinary. Even very poor efforts are not [illegible] upwards of 75%
and the majority of candidates obtain more than 85%. This is obviously intended
to hearten all candidates and to maintain enthusiasm for subsequent occasions.
Nevertheless, the adjudicator’s task is rendered difficult by this system, and his final
marks are perforce given by reference to an order of merit resulting from a quick
consideration of the qualities which make one competitor or group slightly better
than another. The adjudicator needs good experience, judgment and memory.

There is nothing absolute about a scale of intelligence ‘norms’,
or the marking scale of an intelligence test. Unless all intelligence
tests (in addition to all the other desiderata) are standardized as
to mean or average and standard deviation, statements of
I.Q. measurements will be ambiguous. We can only say ‘the I.Q.
of ____ as measured by this or that particular test is x’. The
Moray House Tests yield an average score of 100 and an S.D. of 15.
The Stanford Binet tests were formerly believed to yield an S.D.
of 15 but this is now known to be 16½. In fact the S.D.s of intelligence
test scores vary from 12 to 25 (with a mean score of 100).
The matter can only be made accurate by expressing differences
in achievement in standard deviation units (see page 30).¹

We have left until last a short statement of the chief difficulty,
and one which is perhaps not apparent at first. It is that of establishing
age norms. It is practically impossible to take a sufficiently
large sample which will represent all possible children of any age
group. In primary school life it is perhaps possible, if we cast a
wide net, to find groups which give us a fair sample of the total
population, but even here it is difficult to allow for the children
(either bright or dull) who attend private schools or those who


1 This section should be followed up with Chapter X of Vernon's The Measurement
of Abilities.

go to special schools. After the age of 11, with the children in
various types of secondary schools, the problem becomes even more
difficult. There is still room in the field of simple research by
teachers for experiments using intelligence tests with children of
various ages, physical types, ‘social’ positions and localities. Although
many hundreds of thousands of such tests have been given there 
is still no shortage of opportunities for their use. In rare cases it 
has been possible to test all the children of a certain age or from 
a certain locality but more often the best that can be done is to 
select them from as many schools as possible in different districts 
to give as wide a range of social and economic differences as 


possible. 


To Standardize an Intelligence Test 

If we could give the intelligence test to very large numbers of
children in year groups of 10, 11 and 12 (making sure that each
group is truly representative of all children of that age), we could
plot the three averages as equally-spaced ordinates on a graph
and join the points. This would yield a straight line sloping upwards
and by interpolation we could read off the monthly norms.

[Fig. 19. The line of best fit is found by the method of least squares.
Horizontal axis: age in years and months.]

It would be convenient to have each of the ordinates separated
by 12 units of abscissae in order to facilitate these monthly interpolations.
This method would be open to many objections. The
division into years is far too coarse and little attention is paid to
finer differences in the 11+ year which may be the most important
from our point of view, particularly if we are interested in the
transfer examinations at the end of primary school life. Moreover,
errors of sampling and distribution cannot be corrected by this
method of taking the three year groups.

A much better method is that due to Thomson.¹ A ‘complete,
numerous and uncreamed’ year is tested. The year group is
divided up into 12 monthly groups, which must be as large and
heterogeneous as possible so that each shall be a good sample of
that age group of the whole population. The average score in the
test for each monthly age group is found and plotted as an ordinate
on a graph with abscissae giving the monthly spacings. Owing to
errors in sampling the twelve (or thirteen) plotted points will
usually not lie on a straight line. The line of best fit has to be
found. As usual this is done by the method of least squares, that
is, the sum of the squares of the deviations of the ordinate points
from the line must be made a minimum.² The straight line of best
fit can be extended backwards to deal with the 10+ age group
and forwards for the 12+ group. A child's M.A. can therefore
be read off on this line by reference to his score in the test. His
I.Q. can be found by dividing his M.A. by his chronological age.
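
The fitting of the line and the reading of a mental age from it can be sketched as follows. The monthly averages are hypothetical, the function names are our own, and the I.Q. is expressed as a percentage; the slope and intercept are obtained from the usual least-squares normal equations.

```python
def fit_line(xs, ys):
    """Solve the two normal equations for the slope m and intercept c of y = mx + c."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    m = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    c = (sy - m * sx) / n
    return m, c


def mental_age_months(score, m, c):
    """Age in months at which the fitted norm equals the given score."""
    return (score - c) / m


# Hypothetical average scores for twelve monthly groups, ages 132 to 143 months.
ages = list(range(132, 144))
means = [50.4, 51.1, 52.3, 52.8, 54.2, 54.9, 56.1, 56.8, 58.3, 58.9, 60.2, 61.0]

m, c = fit_line(ages, means)
ma = mental_age_months(57.0, m, c)     # a child scoring 57 marks
iq = 100.0 * ma / 130                  # whose chronological age is 10 years 10 months
print(round(m, 3), round(ma, 1), round(iq))
```

Reading a mental age for a child outside the tested year, as here, relies on extending the straight line beyond the plotted points.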

Intelligence tests may also be standardized by comparing scores 
achieved in them with those in established tests such as the Binet, 
using the same groups of children. 


1 See The British Journal of Educational Psychology, 1932, page 99. 

2 The quantity $\Sigma(v^2)$, where the v's are the deviations obtained when the
twelve or thirteen points obtained from the scores are substituted in the equation
of the straight line y = mx + c. The values of m and c which give this are found
from the equations:

$$\Sigma(y) - m\,\Sigma(x) - nc = 0$$
$$\Sigma(xy) - m\,\Sigma(x^2) - c\,\Sigma(x) = 0$$

where x represents ages, y the scores and n the number of points.


CHAPTER VII 
THE ‘FACTORS’ OF THE MIND 


By measuring we know what things are long and what 
short. The relations of all things may be thus determined 
and it is of the greatest importance to measure the motions of 
the mind. 

MENCIUS, ¢. 335 B.C. 


In the early years of this century Professor Charles Spearman
commenced a serious investigation into the nature of human
abilities. ‘One of the most pernicious (of fallacies) was found to
be the current usage of the word “intelligence” without any
definite idea behind it. Another, that does even greater mischief
in practice, was the irrepressible tendency to assume that terms like
“attention”, “combination”, “analysis”, “range of association”,
“co-ordination of hand and eye” and so forth represent so many
functional unities or behaviour units. Alongside of these two great
impediments to the advance of science has been the pseudo-explanation
of the tests of a person's “intelligence” as measuring a
“level”, “average” or “sample” of his abilities whereas really no
measurement is conceivably possible.’¹ The works on educational
psychology have persisted in telling us that the ‘faculty’ psychology
is dead (which should be true) but there has been a tendency to
resurrect it in terms of mental factors.

Spearman investigated five ‘laws’ quantitatively: the laws of
span, retentivity (inertia and dispositions), fatigue, conation
and primordial potencies (including such influences as those of
age, sex, heredity and health). It was in these investigations, in
which he attempted to put certain aspects of psychology on a
scientific basis, that he made great use of correlation coefficients
between tests, and examined them by mathematical analysis. At
first it was necessary to achieve a ‘Copernican revolution’ in point
of view. Instead of postulating ‘an ill-defined mental entity the
intelligence’, and then by ‘intelligence tests’ trying to obtain
a value for this, he started with a perfectly defined quantitative
value ‘g’ and then demonstrated what mental entity or entities
this really characterizes.

1 C. Spearman, The Abilities of Man, pages 409-10.

Spearman showed that the coefficients of correlation between
tests tend to fall into ‘hierarchical’ order and he further demonstrated
that this was consistent with his ‘Two Factor’ theory.

An example will suffice to show how this works out:

Suppose the correlation coefficients between a number of tests
1, 2, 3, 4, 5, 6 are written down in rows and columns as follows:¹

$$
\begin{array}{c|cccccc}
  & 1 & 2 & 3 & 4 & 5 & 6 \\ \hline
1 &        & r_{12} & r_{13} & r_{14} & r_{15} & r_{16} \\
2 & r_{12} &        & r_{23} & r_{24} & r_{25} & r_{26} \\
3 & r_{13} & r_{23} &        & r_{34} & r_{35} & r_{36} \\
4 & r_{14} & r_{24} & r_{34} &        & r_{45} & r_{46} \\
5 & r_{15} & r_{25} & r_{35} & r_{45} &        & r_{56} \\
6 & r_{16} & r_{26} & r_{36} & r_{46} & r_{56} &
\end{array}
$$


The tests which give each correlation coefficient are denoted by the
subscripts of r, e.g. $r_{34}$ is the correlation coefficient between tests
3 and 4. The above arrangement of rows and columns is known
as a MATRIX and in research work on psychological tests the elementary
properties of such sets of numbers are of prime importance.
Let us consider the matrix rewritten with numerical correlation
coefficients:


Test       1      2      3      4      5      6
  1             .48    .24    .54    .42    .30
  2      .48           .32    .72    .56    .40
  3      .24    .32           .36    .28    .20
  4      .54    .72    .36           .63    .45
  5      .42    .56    .28    .63           .35
  6      .30    .40    .20    .45    .35
Total   1.98   2.48   1.40   2.70   2.24   1.70


1 The exact nature of the tests in this case is of secondary importance. Examples
would be: Analogies; Opposites; Resemblances; Understanding instructions;
‘Completion’.

We have added up the coefficients in columns and now proceed
to rearrange the matrix so that the totals of the columns are in
descending order of magnitude thus:


Test       4      2      5      1      6      3
  4             .72    .63    .54    .45    .36
  2      .72           .56    .48    .40    .32
  5      .63    .56           .42    .35    .28
  1      .54    .48    .42           .30    .24
  6      .45    .40    .35    .30           .20
  3      .36    .32    .28    .24    .20
Total   2.70   2.48   2.24   1.98   1.70   1.40


In this ideal case¹ the ‘hierarchical order’, as Professor Spearman
called it, is easily seen. The correlation coefficients in any two
columns have a constant ratio to one another. Consider the last
two columns:


   6      3
  .45    .36
  .40    .32
  .35    .28
  .30    .24
         .20
  .20


Ignoring those coefficients which are not paired it is easily seen
that there is a ratio of 5 : 4 between the left and right columns.
In other words each coefficient on the right is ⅘ of that on the left.
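
The rearrangement by column totals and the constant ratio between columns can both be verified mechanically. The sketch below is an illustration only; the coefficients are those of the numerical matrix in the text, and the helper names are our own.

```python
R = {  # correlations from the text, keyed by the pair of test numbers
    (1, 2): .48, (1, 3): .24, (1, 4): .54, (1, 5): .42, (1, 6): .30,
    (2, 3): .32, (2, 4): .72, (2, 5): .56, (2, 6): .40,
    (3, 4): .36, (3, 5): .28, (3, 6): .20,
    (4, 5): .63, (4, 6): .45,
    (5, 6): .35,
}
TESTS = range(1, 7)


def r(i, j):
    """Correlation between tests i and j (the matrix is symmetrical)."""
    return R[(min(i, j), max(i, j))]


def column_total(j):
    """Sum of the coefficients in column j, leaving out the blank diagonal."""
    return sum(r(i, j) for i in TESTS if i != j)


def column_ratios(j, k):
    """Ratios of the paired coefficients in columns j and k."""
    return [r(i, j) / r(i, k) for i in TESTS if i not in (j, k)]


order = sorted(TESTS, key=column_total, reverse=True)
print(order)                                         # tests in hierarchical order
print([round(x, 3) for x in column_ratios(6, 3)])    # ratio of column 6 to column 3
```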

This precise relationship would not be apparent in actual tests 
but the tendency would still be evident. Spearman explained this 
hierarchical order by a common factor ‘g’ which was present in 
each test but in the largest quantity in that at the head of the 
hierarchy. Each test also contains a specific factor which would 
not be found in any other test unless similar varieties of the same 
test had been used. A test is said to be ‘saturated’ or ‘loaded’ with 
g to an extent depending on its place in the hierarchy. Suppose 

1 Given by G. H. Thomson, The Factorial Analysis of Human Ability. (The
hypothetical coefficients have been chosen to demonstrate the principle in the
easiest way.)

it were possible to devise a test of pure ‘g’, that is to say, one completely
saturated with ‘g’ and containing no specific or ‘s’ factor.
Such a test would stand at the head of the hierarchy. The self-correlations
of the tests are ideally unity and in the diagonals of the
matrices they have been left blank. In the case of the self-correlation
of pure ‘g’ it can be written in and this number (unity) will conform
to the hierarchy. In the other unities the ‘specifics’ enter and
they are omitted as they do not conform to the rule of proportionality
between the columns.


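We may now rewrite the matrix including ‘pure’ g:

$$
\begin{array}{c|ccccccc}
  & g & a & b & c & d & e & f \\ \hline
g & 1      & r_{ag} & r_{bg} & r_{cg} & r_{dg} & r_{eg} & r_{fg} \\
a & r_{ag} &        & .72    & .63    & .54    & .45    & .36 \\
b & r_{bg} & .72    &        & .56    & .48    & .40    & .32 \\
c & r_{cg} & .63    & .56    &        & .42    & .35    & .28 \\
d & r_{dg} & .54    & .48    & .42    &        & .30    & .24 \\
e & r_{eg} & .45    & .40    & .35    & .30    &        & .20 \\
f & r_{fg} & .36    & .32    & .28    & .24    & .20    &
\end{array}
$$

$r_{ag}$, $r_{bg}$, $r_{cg}$, $r_{dg}$, $r_{eg}$ and $r_{fg}$ are the correlations or saturations of
the tests a, b, c, d, e, f with g. Let us examine the first two
columns:

$$
\begin{array}{cc}
g & a \\ \hline
1      & r_{ag} \\
r_{ag} &        \\
r_{bg} & .72 \\
r_{cg} & .63 \\
r_{dg} & .54 \\
r_{eg} & .45 \\
r_{fg} & .36
\end{array}
$$

Tetrad Differences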


We have already noted that in the hierarchical order the
correlation coefficients in the columns of the matrix tend to be in
the same ratio. Let us take out any group of four coefficients from
the matrix, for example


         d      e
  a     .54    .45
  b     .48    .40


when .54 × .40 = .45 × .48
or .54 × .40 − .45 × .48 = 0.
This is called a tetrad difference¹ and this one is

$$r_{ad} \times r_{be} - r_{bd} \times r_{ae} = 0$$

Thus, another way of putting Spearman’s discovery is that the
tetrad differences tend to be zero.
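
A short check that every tetrad difference of the ideal matrix vanishes may be made as follows. This is a sketch under our own naming; it uses the rearranged matrix with the tests lettered a to f and takes only tetrads which avoid the blank diagonal.

```python
from itertools import combinations

LABELS = "abcdef"
UPPER = [  # coefficients above the diagonal of the rearranged matrix
    [.72, .63, .54, .45, .36],
    [.56, .48, .40, .32],
    [.42, .35, .28],
    [.30, .24],
    [.20],
]
r = {}
for i, row in enumerate(UPPER):
    for offset, value in enumerate(row, start=1):
        first, second = LABELS[i], LABELS[i + offset]
        r[first, second] = r[second, first] = value


def tetrad(a, b, c, d):
    """Tetrad difference for rows a, b and columns c, d."""
    return r[a, c] * r[b, d] - r[a, d] * r[b, c]


print(tetrad("a", "b", "d", "e"))   # .54 x .40 - .45 x .48

# Largest tetrad difference over every choice of two rows and two other columns.
largest = max(
    abs(tetrad(a, b, c, d))
    for a, b in combinations(LABELS, 2)
    for c, d in combinations(LABELS, 2)
    if not {a, b} & {c, d}
)
print(largest)
```

With coefficients obtained from actual tests the differences would not be exactly zero, and the question becomes whether they are small enough to be put down to errors of sampling.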

Spearman gives his tetrad equation in the form:

$$r_{ap} \times r_{bq} - r_{aq} \times r_{bp} = 0$$

When this equation holds throughout any table of correlations, 
and only when it does, every individual measurement of every 
ability or any other variable contained in the table can be divided 
into two parts: “The one part has been called the general factor 
and denoted by the letter g; it is so named because, although 
varying freely from individual to individual, it remains the same 
for any one individual in respect of all the correlated abilities. 
The second part has been called the specific factor and denoted 
by the letter s. It not only varies from individual to individual, 
but even for any one individual from one ability to another.”²
(Spearman’s two-factor theorem is a piece of general mathematical 
analysis and is in no way confined to psychology.) 

As the scores in the tests have been standardized, σ = 1 and
σ² (variance) is also equal to 1. The sum of the variances due to
each of the factors is equal to the test variance. Thus:

(saturation with g)² + (saturation with s)² = 1
g² + s² = 1 (the ‘variance of the test’)
communality + specificity = variance

1 Those who have some knowledge of determinants will see in this a minor
determinant solved by cross-multiplying.
2 The Abilities of Man, page 75.

[Figure. The area of each oval E and F, and each rectangle ABCD and A′B′C′D′ represents
the variance of an ability or test. The shaded overlap represents the covariance
which will equal the correlation coefficient if the areas of each of the
rectangles and ovals can be taken as unity. Where this is not the case the correlation
is given by dividing the area of the overlap by the root of the product of the ovals.]


We can now express the tests in the form of equations containing
g and s.

E.g. taking a saturation with g of .9:

$$(.9)^2 + s^2 = 1$$
$$\therefore\ s = \sqrt{1 - .81} = \sqrt{.19} = .436$$

Hence if z is the score of a person in the test given by the suffix
to z:

$$
\begin{aligned}
z_a &= .9g + .436\,s_a \\
z_b &= .8g + .600\,s_b \\
z_c &= .7g + .714\,s_c \\
z_d &= .6g + .800\,s_d \\
z_e &= .5g + .866\,s_e \\
z_f &= .4g + .917\,s_f
\end{aligned}
$$


The six saturations with ‘g’ are therefore:
.9   .8   .7   .6   .5   .4
and every correlation coefficient in the matrix can be seen to be
the product of two of these saturations, e.g.
.56 = .8 × .7
or $r_{bc} = r_{bg} \times r_{cg}$.
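
The specific saturations and the product rule can be reproduced in a few lines. The sketch assumes only the relation g² + s² = 1 for standardized scores; the names are our own.

```python
import math

g_loadings = {"a": .9, "b": .8, "c": .7, "d": .6, "e": .5, "f": .4}


def specific(g: float) -> float:
    """Saturation with the specific factor, from g squared + s squared = 1."""
    return math.sqrt(1.0 - g * g)


for test, g in g_loadings.items():
    print(f"z_{test} = {g:.1f} g + {specific(g):.3f} s_{test}")

# Each correlation is the product of the two saturations with g.
print(round(g_loadings["b"] * g_loadings["c"], 2))
```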
We have not yet actually shown how to find the g loadings from
the matrix. If we are to fill in the blank spaces in the diagonals
the entries will clearly be the respective g loadings multiplied by
themselves, i.e. squared, for each entry in the square is the product
of two saturations or loadings. These squares of saturations are
called communalities. Let us call this square x² in respect of test
a and fit it into the tetrad formed by tests a and b and tests a and c.


         a      c
  a      x²    .63
  b     .72    .56


If the two-factor theorem is true the tetrad difference is equal
to zero.

Thus .56x² − .72 × .63 = 0
∴ x² = .81
x = .9

Similarly all the other communalities may be found. But, if the
two-factor theory is not the whole story and there are residual
factors, Thurstone found the communalities as we have done,
inserted them in the columns and added up the columns. These
sums were added together and the square root found. The saturations
of the first and only common factor are then given by
dividing each of the column totals by the square root.
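
The whole procedure, from the communalities to the saturations, can be sketched for the ideal matrix. This is an illustration under our own naming and is not offered as Thurstone's full method: each communality is found from a tetrad with the first two other tests, placed in the diagonal, and the column totals are divided by the square root of their sum.

```python
import math

LABELS = "abcdef"
UPPER = [  # coefficients above the diagonal of the rearranged matrix
    [.72, .63, .54, .45, .36],
    [.56, .48, .40, .32],
    [.42, .35, .28],
    [.30, .24],
    [.20],
]
r = {}
for i, row in enumerate(UPPER):
    for offset, value in enumerate(row, start=1):
        first, second = LABELS[i], LABELS[i + offset]
        r[first, second] = r[second, first] = value


def communality(t):
    """x squared for test t, from the tetrad with two other tests u and v."""
    u, v = [x for x in LABELS if x != t][:2]
    return r[t, u] * r[t, v] / r[u, v]


for t in LABELS:
    r[t, t] = communality(t)          # fill in the blank diagonal

column_totals = [sum(r[i, j] for i in LABELS) for j in LABELS]
root = math.sqrt(sum(column_totals))
loadings = [total / root for total in column_totals]
print([round(x, 3) for x in loadings])
```

In this ideal case the saturations found are those already used in the text; with real data the communalities would have to be estimated and the result would be approximate.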

It is not proposed to continue his analysis in this elementary
work but it will suffice to say that whatever number of factors
is found the sum of the squares of their loadings or saturations
(i.e. their variances) will give the test variance which will be
unity, in view of the fact that the scores have been standardized.

Here is an example:¹

The composition of a test may be given as
.71g + .40v + .34n + .47s
where g = Spearman’s factor, v = Stephenson’s verbal factor,
n = a number factor and s = specific test factor.

The sum of the squares of the saturations is practically unity:
(.71)² + (.40)² + (.34)² + (.47)² = 1.0006.

1 From Thomson, The Factorial Analysis of Human Ability.

We have already seen that the tetrad equation $r_{13}\,r_{24} - r_{23}\,r_{14} = 0$
is really another way of writing the minor determinant
which represents the intercorrelations of two tests with two others.

$$
\begin{array}{c|cc}
  & 3 & 4 \\ \hline
1 & r_{13} & r_{14} \\
2 & r_{23} & r_{24}
\end{array}
$$

The process can be extended and tetrad differences of tetrad
differences can be found.

Suppose we extend the tetrad (or a minor determinant of order
two) to a nonad (or a minor determinant of order three). We
could obtain this from the correlation coefficients of three tests
1, 2, 3 with three others 4, 5, 6.

$$
\begin{array}{c|ccc}
  & 4 & 5 & 6 \\ \hline
1 & r_{14} & r_{15} & r_{16} \\
2 & r_{24} & r_{25} & r_{26} \\
3 & r_{34} & r_{35} & r_{36}
\end{array}
$$

It is at once evident that this minor determinant of order three
can be divided into four determinants of order two (or tetrads):

$$
\begin{aligned}
& r_{14}\,r_{25} - r_{24}\,r_{15} \\
& r_{14}\,r_{26} - r_{24}\,r_{16} \\
& r_{14}\,r_{35} - r_{34}\,r_{15} \\
& r_{14}\,r_{36} - r_{34}\,r_{16}
\end{aligned}
$$

This is done by taking the top left coefficient $r_{14}$ as the ‘pivot’.
The four tetrad differences are themselves formed into a tetrad
and this can be evaluated. This operation is known as pivotal
condensation.¹ It must be remembered that the result, if not zero,
has to be divided by the product of all the pivots except the last.
If we do not include the numbers in the diagonals which represent
the self-correlation of a test, we can reduce the minor determinants
of orders two and upwards in the correlation matrix
and it may happen that all the minors of a particular order vanish.
The ‘rank’ of the matrix is equal to the order of its greatest non-vanishing
minor (in terms of its rows) and is one less than the
order of the minors which vanish.
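
Pivotal condensation of a nonad can be sketched as below. The first array is the nonad of tests 1, 2, 3 against tests 4, 5, 6 taken from the numerical matrix; the second is a hypothetical array, chosen by us, which does not vanish. The function names are our own.

```python
def det2(m):
    """Determinant of a 2 x 2 array (a tetrad difference)."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def condense(m):
    """One step of pivotal condensation on a 3 x 3 array, pivot = top left entry."""
    p = m[0][0]
    return [[p * m[i][j] - m[i][0] * m[0][j] for j in (1, 2)] for i in (1, 2)]


def det3_by_condensation(m):
    """Determinant of a 3 x 3 array: the tetrad of tetrads divided by the pivot."""
    return det2(condense(m)) / m[0][0]


# Nonad from the text: tests 1, 2, 3 (rows) against tests 4, 5, 6 (columns).
nonad = [
    [.54, .42, .30],
    [.72, .56, .40],
    [.36, .28, .20],
]
print(det3_by_condensation(nonad))

# A hypothetical array whose determinant is not zero, for comparison.
print(det3_by_condensation([[2, 0, 1], [1, 3, 2], [1, 1, 4]]))
```

For the ideal matrix the tetrads already vanish, so the nonad vanishes with them and the rank is one, in keeping with a single common factor.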


1 See Turnbull and Aitken, Theory of Canonical Matrices, or Thomson,
The Factorial Analysis of Human Ability, Chapter VI.

Thurstone has shown that a set of tests can be analysed into a
number of factors, common to each test, equal to the rank of their
correlation matrix plus a specific factor for each test. The factor
‘loadings’ or ‘saturations’ in each test can be determined by using
the ‘centroid’ or ‘centre of gravity’ method. It is called the
‘centroid’ method because Thurstone conceived it as a means of
finding a centroid or centre-of-gravity in a geometrical model.
As we have already seen it is easy to make a model which contains
only three vectors (whether these are test-scores or factors) but
space of four or more dimensions, though it offers no particular
difficulty to the mathematician, cannot be modelled in the ordinary
‘Euclidean’ way. The geometry of ‘hyperspace’ is a logical extension
of that of three dimensions and it usually yields readily to
analytical treatment. That is to say, instead of worrying about the
difficulty or impossibility of making useful models we can find and
develop the simple algebraic equivalent.¹

Spearman’s work has not gone unchallenged. Although it is 
true to say that the tetrad differences of Spearman’s hierarchies 
were either zero, or were normally distributed about zero, it must 
be confessed that there was a tendency to consider too few cases 
and perhaps to overlook tests which did not fit in with the 
hierarchy. 

Spearman and his school analysed the results of too few tests, 
and too readily assumed that all the tetrad differences were 
normally distributed about zero. Later, many tests were found 
which did not fit in with the two-factor theory, and group factors 
had to be admitted. Thurstone of Chicago, using a more extended
analysis, showed that the Spearman results were only a particular
case of a larger generalization. It is beyond the scope of this
introductory work to give a detailed account of Thurstone’s
various methods. As in other cases they can be thought of in
geometrical and in corresponding algebraical terms. For the purpose
of explanation the former method is useful, but it is the
analytical processes (matrices and determinants) which are
actually used for calculating the factors.

* The student who is not able to work through THOMSON’s The Factorial Analysis
of Human Ability or BURT’s The Factors of the Mind may obtain a simple account of
modern work in this field in THOMSON’s booklet Some Recent Work in Factorial
Analysis and in BURT’s review of Thomson’s books in The British Journal of
Educational Psychology, Vol. XVII, February 1947.

Other workers have found group factors, such as a verbal factor 
v which is common to a group of tests but not to all. This could 
be represented like this: 


GROUP FACTORS WITH g AND s                        SPEARMAN'S g AND s

        General   Group factor   Specific                 General   Specific
Test    factor    (a, b or c)    factor           Test    factor    factor
A          x           x            x             A          x         x
B          x           x            x             B          x         x
C          x           x            x             C          x         x
D          x           x            x             D          x         x
E          x           x            x             E          x         x
F          x           x            x             F          x         x
G          x           x            x             G          x         x
H          x           x            x             H          x         x
I          x           x            x             I          x         x


As we have already seen, the pioneer work of Spearman
described in The Abilities of Man with his g and s factors was 
limited. Doubtless, he was justified in drawing the conclusions 
which he arrived at from the mental tests which he applied and 
the analysis of his results. Nevertheless, further researches have 
shown the need for more factors and the need for group factors 
which are common to a limited number of test results. Some 
method of multiple-factor analysis had to be found to deal with 
group factors and to obviate the restriction of no correlation except 
through a factor common to all tests. 

It is beyond the scope of this work to deal with the methods of 
multiple-factor analysis. There is a considerable literature on the 
subject and the student would do well to start his study of the 
matter with Thomson’s excellent Factorial Analysis of Human Ability. 
Multiple-factor analysis has been developed by Sir Cyril Burt in 
England, and L. L. Thurstone and H. Hotelling in America. 

The most popular method at present in use is that due to Thurstone,
or some modification of it. At the time of writing this book
the exact nature of the ‘factors of the mind’ is still a matter of
much discussion between psychologists. Even on the cognitive
side of mental activity various claims are put forward by different
workers concerning the nature, number and importance of these
factors. It is too early to decide whether they bear some relation
to neurological qualities of the brain, whether they are mathematical
artefacts, whether they are just convenient mathematical
symbols or whether they represent fundamental quantities in
human cognition. (Attempts to submit the affective and conative
aspects of mental activity to factorial analysis are fraught with
even greater difficulty. The factors suggested by various psychologists,
which describe temperament and personality, are legion.
Raymond Cattell has listed over 1,000 traits which he has gathered
together and arranged in more than fifty ‘factors’. It is too early
to see whither this will lead us. It will suffice for the student to
know that there are well-marked personality traits, such as
‘ascendency-submission’, which are tested by questions and
marked according to a given scale.)

1 Various leading psychologists in Britain and America have different ways of
regarding factors. Thomson, Allport and Anastasi maintain that factors are
statistical artefacts without any reality or neurological counterpart. Burt regards
them as principles of classification described by selective operators, whereas
Spearman originally thought of them as fundamental functions of the mind. Guilford
calls them fundamental dimensions of the mind and the Americans Thurstone and
Holzinger regard factors as primary or fundamental abilities. The student need not
be unduly worried about this. The atomic physicist is up against similar problems
when he is considering such problems as the idea of the ‘reality’ of an electron.

A fruitful way of regarding tests, their correlations and factors
is to represent them as vectors or straight lines. Two lines may be
drawn through a point to represent the tests and the correlation
between them is numerically equal to the cosine of the angle made by
the two lines. The point of intersection of the lines represents
a person who has made an average score on both tests and
other points on each line represent standardized scores in the
tests, the positive direction being shown by the arrows. The
degree of correlation increases as the angle decreases: there will be
perfect positive (+1) correlation when the lines coincide,
zero correlation when they are at right angles and negative
correlation when the angle becomes obtuse. Any point on
the paper represents the scores of a person in each of the tests and
each score is given by the perpendicular distance of the point
from one of the lines.

[Figure: two test vectors, X and Y, drawn as arrowed lines through a common point.]
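The link between correlation and angle can be checked numerically. The sketch below assumes the usual product-moment formula; the two sets of marks are invented and the function names are hypothetical. It converts a correlation into the angle between the two test vectors and shows the special cases mentioned above.

```python
import math


def correlation(x, y):
    """Product-moment correlation of two equally long lists of scores."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [v - mx for v in x]          # deviations from the mean
    dy = [v - my for v in y]
    dot = sum(a * b for a, b in zip(dx, dy))
    return dot / math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))


def angle_between_tests(r):
    """Angle in degrees whose cosine is the correlation r."""
    return math.degrees(math.acos(r))


# Hypothetical marks of six pupils on two tests.
test_x = [12, 15, 9, 18, 14, 10]
test_y = [30, 34, 27, 41, 29, 31]

r = correlation(test_x, test_y)
print(round(r, 3), round(angle_between_tests(r), 1))

# Special cases: coincident lines, 60 degrees, right angle, obtuse angle.
for value in (1.0, 0.5, 0.0, -0.5):
    print(value, round(angle_between_tests(value), 1))
```

The correlation computed from deviation scores is itself the cosine of the angle between the two deviation vectors, which is why an acute angle goes with a positive correlation and an obtuse angle with a negative one.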

The idea of zero correlation when the lines are at right angles
(cos 90° = 0) is a useful one. Sometimes factors can be
thought of as vectors which are at right angles. They are then
wholly independent factors and have no common quantity or
overlap. Instead of speaking of them as rectangular factors we
use the word ORTHOGONAL, from the Greek, to describe them. The factors for
which Spearman sought would thus be spoken of as orthogonal.
Oblique factors are those which could be represented by lines at 
an angle with one another which is less than a right angle. Most 
of the methods originated by Alexander, Thurstone and other 
recent workers use oblique factors. 

Let us represent two tests by the lines X'X and Y'Y meeting at
O. The cosine of angle XOY = the correlation between the tests. A
testee with average marks in both tests will be at the point O and
other testees will be represented by swarms of dots, like bullet holes
round a bull’s-eye O, with the density of dots per unit area becoming
smaller the further we go from O. Now the analysis
of test results is equivalent to referring these test vectors to axes
at right angles and these latter will represent orthogonal factors.
Consider the simplest case of two factor vectors OA and OB
respectively bisecting the angles between the test vectors. This
was the idea with which Hotelling started his analysis. OA and
OB would represent his ‘principal components’. There is no
necessity, however, for OA and OB to be placed in the position
we have taken. They could be placed anywhere provided that
they passed through O and were at right angles (orthogonal).
These factor vectors can be rotated to the most convenient
position; indeed, if either OA or OB is made to coincide with
either OX or OY, one of the factors is given by one of the test
vectors.

When OA bisects the angle XOY, as it does in the case we have 
given, the scores along OA clearly give the best representation of 
the results of the two tests. Such a vector is known as the ‘first
principal component’ (Hotelling).
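One way of seeing why the bisecting vector gives the ‘best representation’ is to ask which weighted combination of the two standardized tests has the greatest variance. That criterion is an assumption here, since the text does not state it, though it is the one usually associated with Hotelling's components. In the sketch below the correlation of .6 and the function name are invented for illustration.

```python
import math


def variance_of_composite(theta_deg, r):
    """Variance of cos(t)*x + sin(t)*y for standardized x and y correlated r.

    cos^2(t) + sin^2(t) + 2 r cos(t) sin(t) simplifies to 1 + r sin(2t).
    """
    t = math.radians(theta_deg)
    return 1.0 + r * math.sin(2.0 * t)


r = 0.6  # hypothetical correlation between the two tests

# Try every whole-degree weighting and keep the one with most variance.
best = max(range(0, 180), key=lambda d: variance_of_composite(d, r))

print(best)                                   # equal weights on both tests
print(variance_of_composite(45, r))           # first component: 1 + r
print(variance_of_composite(135, r))          # second component: 1 - r
```

The maximum falls at 45 degrees, that is, at equal weights for the two tests, and the two components together account for the whole variance of 2.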

In the case of a Spearman analysis of two tests three orthogonal 
factors would be necessary, that is, a common g and two separate 
s factors. Thus his factors may be represented by three straight 
lines at right angles meeting in a point like three edges of a 
rectangular box meeting at a corner. These three vectors (still 
remaining at right angles to one another) are rotated until one 
is at right angles to the first test and another is at right angles to 
the second test. Then, g is represented by the third vector. In 
general, Spearman’s ‘two-factor’ analysis requires one more 
dimension in space than the number of tests. Again, we have to 
use the geometry of ‘hyperspace’ and models are of only limited 
help. 

If we wish to add a third test to those which we have represented 
by the two lines through the point O on a plane surface we shall 
have to consider three-dimensional space. We shall find from 
trigonometrical tables angles whose cosines are the correlation 
coefficients of the third test and each of the other two respectively. 
We shall then find a line going through O which makes these 
angles with the first two vectors. Usually we shall obtain a kind 
of tripod with one of the vectors coming out of the plane of the
paper. If the sum or the difference of the angles which we have
found is exactly equal to the angle between the two original test 
lines, the three lines will lie in the plane of the paper. Again, if 
any two angles together are less than a third angle it will be 
impossible to draw the third line. It will be ‘imaginary’ in the 
mathematical sense. More than three tests demand the use of 
multi-dimensional space and although this cannot be visualized, 
it is nevertheless a useful mathematical device for work with four 
or more tests. 
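The three cases just described (a tripod, three lines in one plane, and an impossible third line) can be told apart from the three correlations alone. The following is a sketch under the assumption that each correlation is the cosine of the angle between its two test vectors; the function names and the sample correlations are hypothetical.

```python
import math


def angles(r12, r13, r23):
    """Angles in degrees between the three pairs of test vectors."""
    return tuple(math.degrees(math.acos(r)) for r in (r12, r13, r23))


def classify(r12, r13, r23, tol=1e-6):
    """Say whether a third test vector can be drawn, and how."""
    a12, a13, a23 = angles(r12, r13, r23)
    smallest = abs(a13 - a23)
    largest = min(a13 + a23, 360.0 - (a13 + a23))
    if a12 < smallest - tol or a12 > largest + tol:
        return "impossible"      # two angles together fall short of the third
    if abs(a12 - smallest) <= tol or abs(a12 - largest) <= tol:
        return "coplanar"        # sum or difference equals the third angle
    return "tripod"              # third vector comes out of the plane


cos30 = math.cos(math.radians(30))

print(classify(0.5, 0.5, 0.5))       # three angles of 60 degrees
print(classify(0.5, cos30, cos30))   # 30 + 30 = 60 degrees
print(classify(-0.9, 0.9, 0.9))      # about 26 + 26 against 154 degrees
```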


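Note on Correlation Matrices and Lines of Regression


Consider the following correlation matrix in which $x_0$, $x_1$,
$x_2$ ... etc. are tests of certain aptitudes:

$$
\begin{array}{c|cccccc}
    & x_0 & x_1 & x_2 & x_3 & \cdots & x_n \\ \hline
x_0 & 1 & r_{01} & r_{02} & r_{03} & \cdots & r_{0n} \\
x_1 & r_{01} & 1 & r_{12} & r_{13} & \cdots & r_{1n} \\
x_2 & r_{02} & r_{12} & 1 & r_{23} & \cdots & r_{2n} \\
x_3 & r_{03} & r_{13} & r_{23} & 1 & \cdots & r_{3n} \\
\vdots & & & & & & \\
x_n & r_{0n} & r_{1n} & r_{2n} & r_{3n} & \cdots & 1
\end{array}
$$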


Each of the correlation coefficients r may also be considered as
the regression of the score in one test on that of another. In other
words, the estimated score in one ability or aptitude is expressed
as a linear function of the scores in a number of others $x_1$, $x_2$,
$x_3$ ... $x_n$. The regression equation becomes:

$$x_0 = b_1 x_1 + b_2 x_2 + b_3 x_3 + \ldots + b_n x_n$$

where $b_1, b_2, b_3 \ldots b_n$ are the regression coefficients.

It is sometimes necessary to know how far estimates made from
regression equations differ from the true values.

This is given by the multiple correlation ($R_m$) between the
estimates and the true values.

Now
$$R_m = \sqrt{b_1 r_{01} + b_2 r_{02} + b_3 r_{03} + \ldots + b_n r_{0n}}$$

Those who have some knowledge of determinants will see that
this may be expressed as

$$R_m = \sqrt{1 - \frac{\Delta}{\Delta_{00}}}$$

where $\Delta$ is the complete correlation determinant (or matrix)
given above and $\Delta_{00}$ is the minor determinant which is left when
the first row and column are removed.

Similarly we could use the second regression equation and find
estimates of x when y is given and these errors of estimate would
be distributed with a standard deviation:

$$\sigma_x \sqrt{1 - r_{xy}^2}$$

Here we find again the alienation (k) where $k = \sqrt{1 - r^2}$.
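The two expressions for the multiple correlation can be compared on a small case. The sketch below assumes scores in standard measure, so that the regression coefficients are the standardized weights. The three correlations are invented and the function names are hypothetical. With only two predictors the weights can be written down directly, which gives an independent check on the determinant form.

```python
import math


def det(m):
    """Determinant by expansion along the first row (adequate for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, value in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * value * det(minor)
    return total


def multiple_correlation(matrix):
    """R_m for estimating test 0 from the others, from the full correlation matrix."""
    minor_00 = [row[1:] for row in matrix[1:]]   # first row and column removed
    return math.sqrt(1.0 - det(matrix) / det(minor_00))


# Hypothetical correlations: r01 = .6, r02 = .5, r12 = .3.
r01, r02, r12 = 0.6, 0.5, 0.3
matrix = [[1.0, r01, r02],
          [r01, 1.0, r12],
          [r02, r12, 1.0]]

# Standardized regression weights for two predictors.
b1 = (r01 - r02 * r12) / (1.0 - r12 ** 2)
b2 = (r02 - r01 * r12) / (1.0 - r12 ** 2)

from_weights = math.sqrt(b1 * r01 + b2 * r02)
from_determinants = multiple_correlation(matrix)
print(round(from_weights, 4), round(from_determinants, 4))
```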


We have already seen (page 54) that if two arrays of scores
x and y have in them a common factor c while the other elements
are unique¹

$$r_{xy} = \frac{\sigma_c^2}{\sigma_x \sigma_y}$$

Thus we may write

$$r_{12} = \frac{\sigma_c}{\sigma_1} \cdot \frac{\sigma_c}{\sigma_2} = a_1 a_2$$

where $a_1 = \frac{\sigma_c}{\sigma_1}$ and $a_2 = \frac{\sigma_c}{\sigma_2}$.

Suppose we have four tests 1, 2, 3, 4 in which there is a common
element c and that the scores have been correlated in pairs giving
the coefficients $r_{12}, r_{13}, r_{14}, r_{23}, r_{24}, r_{34}$.

Thus:

$r_{12} = \frac{\sigma_c}{\sigma_1} \cdot \frac{\sigma_c}{\sigma_2} = a_1 a_2$

$r_{13} = \frac{\sigma_c}{\sigma_1} \cdot \frac{\sigma_c}{\sigma_3} = a_1 a_3$

$r_{14} = \frac{\sigma_c}{\sigma_1} \cdot \frac{\sigma_c}{\sigma_4} = a_1 a_4$

$r_{23} = \frac{\sigma_c}{\sigma_2} \cdot \frac{\sigma_c}{\sigma_3} = a_2 a_3$

$r_{24} = \frac{\sigma_c}{\sigma_2} \cdot \frac{\sigma_c}{\sigma_4} = a_2 a_4$

$r_{34} = \frac{\sigma_c}{\sigma_3} \cdot \frac{\sigma_c}{\sigma_4} = a_3 a_4$

If the correlation coefficients are multiplied in pairs we get:

$r_{12} r_{34} = a_1 a_2 a_3 a_4$

$r_{13} r_{24} = a_1 a_2 a_3 a_4$

$r_{14} r_{23} = a_1 a_2 a_3 a_4$

Thus

$r_{12} r_{34} - r_{13} r_{24} = 0$

$r_{12} r_{34} - r_{14} r_{23} = 0$

$r_{13} r_{24} - r_{14} r_{23} = 0$

These are known as the tetrad differences.
Spearman² proved the converse of this, that is, if a common
element c runs through each test the tetrad differences
$r_{13} r_{24} - r_{14} r_{23}$ will be zero.

1 The coefficient of correlation between two sets of measures is the proportion of
the total variance which is due to the common factor in each test:

$$r_{xy} = \frac{\sigma_c^2}{\sigma_x \sigma_y}$$

where $\sigma_c^2$ is the variance due to the common factor and $\sigma_x \sigma_y$ the total variance.
Note that variance is the square of the standard deviation and that variances may
be added algebraically.

2 C. Spearman, The Abilities of Man, Appendix, pp. iii-vi.
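The vanishing of the tetrad differences can be verified for any set of four loadings. In the sketch below the loadings are invented and the function names are hypothetical. Each correlation is built as the product of two loadings, and the three tetrad differences are then formed from the six correlations.

```python
from itertools import combinations


def correlations_from_loadings(a):
    """r_ij = a_i * a_j for every pair of tests that share one common factor."""
    return {(i, j): a[i] * a[j] for i, j in combinations(sorted(a), 2)}


def tetrad_differences(r):
    """The three tetrad differences of four tests numbered 1 to 4."""
    return (r[1, 2] * r[3, 4] - r[1, 3] * r[2, 4],
            r[1, 2] * r[3, 4] - r[1, 4] * r[2, 3],
            r[1, 3] * r[2, 4] - r[1, 4] * r[2, 3])


# Hypothetical loadings of four tests on the common element.
loadings = {1: 0.9, 2: 0.8, 3: 0.7, 4: 0.6}
r = correlations_from_loadings(loadings)
print([round(t, 12) for t in tetrad_differences(r)])

# If one correlation departs from the one-factor pattern the tetrads no longer vanish.
disturbed = dict(r)
disturbed[3, 4] = 0.60          # in place of 0.7 * 0.6 = 0.42
print([round(t, 3) for t in tetrad_differences(disturbed)])
```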


A Note on Tetrad Relations


Adapted from PIAGGIO, Mathematical Gazette, Vol. XVII, No. 222.
Suppose that we have k sets of numbers denoted briefly by A, B . . . and that
these are expressible in terms of (k + 1) other sets $G, S_a, S_b, \ldots$ no two of which
are correlated and 2k constants $m_a, m_b, \ldots n_a, n_b, \ldots$ by equations such as:

$a = m_a g + n_a s_a$ ... (1)
$b = m_b g + n_b s_b$ ... (2)

Each equation really denotes N equations as a can take any one of the values
$a_1, a_2, a_3 \ldots$ with a corresponding set of values for g and $s_a$. But $m_a$ and $n_a$ are
constants which occur unchanged in each of the N equations. Taking the arithmetic
mean of the N expressions of a given type (called averaging) gives us:

average of a = 0
average of $a^2$ = $\sigma_a^2$
average of ab = $\sigma_a \sigma_b r_{ab}$

If all the numbers have been reduced to standard measure (i.e., mean of numbers =
0 and $\sigma$ = 1) these averages reduce to 0, 1 and $r_{ab}$ respectively.

From equations (1) and (2) we get
$ab = m_a m_b g^2 + m_a n_b g s_b + m_b n_a g s_a + n_a n_b s_a s_b$
from which by averaging and noting that g and s are uncorrelated
$r_{ab} = m_a m_b$ ... (3)
Similarly $r_{cd} = m_c m_d$ and so on.
Hence $r_{ab} r_{cd} - r_{ac} r_{bd} = 0$

By permuting the letters a, b, c, d we get three such relations, but only two are
independent.
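Relation (3) can also be watched at work in artificial data. The sketch below is an illustration only: it draws a general factor and four specific factors at random, all in standard measure, takes $n_a = \sqrt{1 - m_a^2}$ so that each test is also in standard measure (an assumption made here, not stated in the note), and compares the sample correlations with the products of the m's. The weights, sample size and seed are all hypothetical.

```python
import math
import random


def correlation(x, y):
    """Product-moment correlation of two equally long lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate(m, n=20000, seed=1):
    """Scores of n persons on tests built as m*g + sqrt(1 - m^2)*s."""
    rng = random.Random(seed)
    scores = {name: [] for name in m}
    for _ in range(n):
        g = rng.gauss(0.0, 1.0)                  # general factor, shared by all tests
        for name, weight in m.items():
            s = rng.gauss(0.0, 1.0)              # specific factor of this test only
            scores[name].append(weight * g + math.sqrt(1.0 - weight ** 2) * s)
    return scores


m = {"a": 0.9, "b": 0.8, "c": 0.7, "d": 0.6}     # hypothetical constants
scores = simulate(m)
r = {(p, q): correlation(scores[p], scores[q]) for p in m for q in m if p < q}

tetrad = r["a", "b"] * r["c", "d"] - r["a", "c"] * r["b", "d"]
print(round(r["a", "b"], 3), round(m["a"] * m["b"], 3))
print(round(tetrad, 3))
```

In a sample the tetrad difference is close to zero but not exactly zero, which is the situation Spearman had in mind in speaking of tetrad differences distributed about zero.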


CHAPTER VIII 


THE NULL HYPOTHESIS, 
CHI-SQUARED AND CONTINGENCY 


In educational research as elsewhere we frame
hypotheses by having an intelligent regard for the data in
hand and we wish to find whether the observed differences
from our hypothetical law are likely to be due to chance errors of
sampling and observation, or represent some real departure from
or disagreement with our tentative ‘law’. A frequently recurring
case is that of fitting a number of points, which have been plotted
as the result of observations, to a curve whose shape and formula
are well recognized. Few if any of the points may actually lie on
the line and we need a method of showing whether their failure
to do so is due to chance errors or whether the line and its formula
are not applicable in this case. Again, we may have a series of
examination marks which at first sight seem to show that there is
a significant difference in achievement in arithmetic of a particular
age group between the sexes. We may start with the hypothesis
that there is no difference and then find a statistical method of
showing the probability that this is so or otherwise.

1 This method has been compared with that of British justice, where the prisoner
is assumed to be innocent until he has been proved guilty.

The null hypothesis is an exact statement, whereas if we begin
by saying ‘the girls are better than the boys’ we have made a
qualitative but not a quantitative statement and statistically it
would be difficult or impossible to start from here. It must be
remembered that if we disprove the null hypothesis we have not
usually proved what is apparently the truth concerning a single
rival hypothesis. There may be many factors and variables which
have to be brought under control before this is done. For example,
if we assume that there is no common element in two arrays of
scores, i.e. that they are uncorrelated, and we find that this null
hypothesis is disproved, we have no right to assume that a linear
correlation exists between the two arrays. We need further
information concerning their nature and distribution before this
can be done. In proving or disproving a null hypothesis we must
remember that we cannot do it absolutely but only to certain
degrees of probability. There is no absolute measure of what is
significant and what is not: we can only say, for example, that the
chances are 40 to 1 that the null hypothesis is true, i.e. that any
differences are due to chance errors of sampling and observation.

A 1% level of significance would mean that only 1 chance in 100
would be against the acceptance of the hypothesis and the result
would be highly significant; a 5% level would mean that 5 chances in
100, or 1 in 20, would be against the hypothesis. Many workers
would accept this, at any rate until further investigations could be 
made. Here again we must reiterate that statistics deals with 
varying degrees of probability and not with certainties. 

A simple example illustrating the use of the null hypothesis and 
which is typical of many simple investigations in psychology and 
education is an estimation of the probability that scores in a num- 
ber of items are significantly better than would arise by mere 
guessing. Suppose we ask ten questions which require only a
positive or negative response for each; guessing would produce the
correct answers on the average five times out of ten. We have to
ask what significance is attached to 7 or 8 correct responses out 
of ten. By guessing the chances of right and wrong responses on 
each occasion are equal. 

If we expand $(R + W)^{10}$ by the binomial theorem we find the
following coefficients for the various combinations of R and W in
the 10 trials.

                                                              Probability
Term                                                          ratio

1 $R^{10}$         1 chance in 1024 all right                   1/1024
10 $R^9 W$         10 chances in 1024 of 9 right 1 wrong        10/1024
45 $R^8 W^2$       45 chances in 1024 of 8 right 2 wrong        45/1024
120 $R^7 W^3$      120 chances in 1024 of 7 right 3 wrong       120/1024
210 $R^6 W^4$      210 chances in 1024 of 6 right 4 wrong       210/1024
252 $R^5 W^5$      252 chances in 1024 of 5 right 5 wrong       252/1024
210 $R^4 W^6$      210 chances in 1024 of 4 right 6 wrong       210/1024
120 $R^3 W^7$      120 chances in 1024 of 3 right 7 wrong       120/1024
45 $R^2 W^8$       45 chances in 1024 of 2 right 8 wrong        45/1024
10 $R W^9$         10 chances in 1024 of 1 right 9 wrong        10/1024
1 $W^{10}$         1 chance in 1024 all wrong                   1/1024

Total no. of chances = 1024.


The probability of getting 7 right is 120/1024, which is not significant
at the 5% level.

The probability of getting 8 right is 45/1024, which is almost significant
at the 5% level.

To get 9 right is significant at the 1% level and 10 right is
significant at .1% (it is very highly significant).*
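The chances in the table can be regenerated from the binomial coefficients. The sketch below uses hypothetical function names and gives, for ten two-choice questions answered by guessing, both the chance of exactly a given number right and the chance of that number or more. The second figure, the ‘tail’, is the one ordinarily compared with a significance level. The text does not say which it intends, so both are shown.

```python
from math import comb


def chance_of_exactly(k, n=10):
    """Probability of exactly k right out of n by pure guessing."""
    return comb(n, k) / 2 ** n


def chance_of_at_least(k, n=10):
    """Probability of k or more right out of n by pure guessing."""
    return sum(comb(n, j) for j in range(k, n + 1)) / 2 ** n


for k in (7, 8, 9, 10):
    print(k,
          f"{comb(10, k)}/1024 = {chance_of_exactly(k):.4f}",
          f"tail = {chance_of_at_least(k):.4f}")
```

For 8 right the exact chance is about 4.4 per cent and the tail about 5.5 per cent, so the verdict at the 5% level depends on which of the two is taken.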


One of the most useful methods of investigating the numerical
results of educational research is the use of chi-squared, $\chi^2$.
Pearson developed this at the beginning of the present
century and in recent years it has become popular in attacking
many problems requiring the analysis of variance. The most
common and straightforward use of $\chi^2$ is that of testing the
agreement between observed quantities and those expected in
view of an apparently suitable hypothesis. For instance, we might
wish to find whether a set of measures fit a normal distribution
curve to such an extent that any discrepancies are due to errors
of sampling and are not significant.

If $F_e$ is a number expected and x is the difference between this
and the actual number observed F (i.e. the observed number
$F = F_e + x$)

then
$$\chi^2 = \sum \left( \frac{x^2}{F_e} \right)$$

* This simple work may be extended so that the probabilities are found by
reference to the areas under parts of the normal curve. See also R. A. FISHER, The
Design of Experiments (1935), chap. 2.


It is obvious that in the case of perfect agreement between the
observed and expected values $\chi^2$ will vanish and its value will be
smaller in accordance with the closeness of agreement between
the sets of values. Tables have been prepared which give a value
for P, the proportion of cases in which any value of $\chi^2$ is exceeded.
The tables give the relations between $\chi^2$ and P, the probability,
for various values of n, which must be an integer and represents
the number of degrees of freedom or independent variates of the observed
classes. In educational investigations there arise many cases
where we might wish to find whether the differences between
theoretical or predicted values and those actually observed were
due to chance errors of sampling or whether the differences are
significant. The chi-square method is also useful to test the
‘goodness of fit’ of a set of given values to those represented by a
standard curve. For example, we know from tables the values of
the ordinates of the normal probability curve at various sigma
distances from the mid-point. We may be given a set of values to
fit to the curve¹ and the ‘goodness of fit’ may be estimated by $\chi^2$.
Again, we may wish to compare teachers’ estimates of pupils’
work in classes (A, B, C, D, etc.) with their subsequent achievements
in examinations. Again, we may wish to compare groupings
or estimates with respect to one factor, quality or attainment
with those of another. Here we use a contingency table and from
this we may obtain a value for the probability that the differences
are not due to chance. $\chi^2$ does not normally measure correlation;
it is really a measure of divergence rather than association.

1 See page 96.

Example: The following table gives the theoretical frequencies
$f_e$ and the observed frequencies f in fitting values to a normal
curve at the given intervals. Find whether the fit is good and
whether any deviations from normal distributions are due to
chance fluctuations.

$$\chi^2 = \sum \frac{(f - f_e)^2}{f_e}$$

The table should be set out as follows:

             Frequencies
Interval      f      f_e     (f − f_e)    (f − f_e)²    (f − f_e)²/f_e

280-340      17      15          2            4             .27
260-280      13      15         −2            4             .27
240-260      20      20          0            0             .00
220-240      27      24          3            9             .38
200-220      23      25         −2            4             .16
180-200      19      21         −2            4             .19
160-180      15      17         −2            4             .23
100-160      23      20          3            9             .45
Totals      157     157          0                      χ² = 1.95

Knowing seven of the observed frequencies and the total, we
could find the eighth. Thus, there are (8 − 1) = 7 degrees of
freedom. By consulting the Fisher or Elderton tables for 7
degrees of freedom and $\chi^2$ = 1.95 we find a probability value of
P = .96. This means that even if the function were distributed 
normally throughout all its measures, as great a discrepancy as 
we have obtained would occur in samples 96 times in 100. The 
fit is in fact better than usual for the most probable value of P 
for a true fit is .50. [If the process were repeated for many samples 
with the same mean and standard deviation the number of 
degrees of freedom would be two less, i.e. 5. The value for P in 
this case would be .84.] 
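The working of this example can be repeated by machine. The sketch below recomputes $\chi^2$ from the eight pairs of frequencies and, in place of the printed tables, evaluates P from the standard closed form for the upper tail of the chi-squared distribution with a whole number of degrees of freedom. The function names are hypothetical, and the tail routine is a substitute for the tables, not the method the text itself uses.

```python
import math


def chi_squared(observed, expected):
    """Sum of (f - fe)^2 / fe over all classes."""
    return sum((f - fe) ** 2 / fe for f, fe in zip(observed, expected))


def chi2_upper_tail(x, df):
    """P that chi-squared with df (a whole number >= 1) degrees of freedom exceeds x."""
    half = x / 2.0
    if df % 2:                                   # odd: start from 1 degree of freedom
        q, nu = math.erfc(math.sqrt(half)), 1
    else:                                        # even: start from 2 degrees of freedom
        q, nu = math.exp(-half), 2
    while nu < df:                               # step up two degrees at a time
        q += half ** (nu / 2.0) * math.exp(-half) / math.gamma(nu / 2.0 + 1.0)
        nu += 2
    return q


observed = [17, 13, 20, 27, 23, 19, 15, 23]      # f, from the table above
expected = [15, 15, 20, 24, 25, 21, 17, 20]      # f_e, from the table above

x2 = chi_squared(observed, expected)
print(round(x2, 3))                              # unrounded sum of the last column
print(round(chi2_upper_tail(x2, 7), 2))          # 7 degrees of freedom
print(round(chi2_upper_tail(x2, 5), 2))          # 5 degrees of freedom
```

The unrounded total is a little below the 1.95 obtained by adding the rounded entries, and the value of P for 7 degrees of freedom agrees with the .96 read from the tables.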

It often happens that it is necessary to determine the degree of
association between two sets of measures which are not normally
distributed but are given in the form of numbers in each of a
series of classes in both sets of measures. For instance, we may
mark a set of Physics papers in four classes A, B, C and D without
further distributions within each class. In the same way we may
mark a set of Chemistry papers in four (or some other number of)
classes of merit A, B, C, D.

We wish to find whether there is a significant degree of association
between the two sets.

It is convenient to arrange the number of cases which fall into
each group (the frequency in the group) in a cell in a square or
rectangle.

                          PHYSICS

                   D      C      B      A   |  Add

              A    1      0      3      6   |   10
CHEMISTRY     B    2      5      5      1   |   13
              C    3      3      1      2   |    9
              D    4      3      0      1   |    8

            Add   10     11      9     10   |   40

Here we have sixteen cells or categories and each one represents
a group in Physics and one in Chemistry so that every possible
case is covered. The number in each cell represents the number of
students in each category, e.g. 6 students have A marks in Physics
and in Chemistry, 3 have a D mark in Chemistry and a C mark in
Physics. If there were no correlation between the sets of marks
we might expect the 10 students with A's in Chemistry to be
distributed in the proportion 10 : 11 : 9 : 10 in their Physics groups,
that is to say, about equal numbers in each group.

Suppose now that there were no relationship between the 
groups in Chemistry and those in Physics. Let us calculate how 
many students would fall into each of the 16 cells in this case. 
($F_e$ is the expected frequency.)

$F_e$ for A in Chemistry and D in Physics = $\frac{10 \times 10}{40}$

$F_e$ for A in Chemistry and C in Physics = $\frac{10 \times 11}{40}$

$F_e$ for A in Chemistry and B in Physics = $\frac{10 \times 9}{40}$ and so on.

Now make a 4 × 4 table of these $F_e$'s:

          D        C        B        A

A       2.50     2.75     2.25     2.50
B       3.25     3.57     2.92     3.25
C       2.25     2.47     2.02     2.25
D       2.00     2.20     1.80     2.00

TABLE OF $F_e$'s


          D        C        B        A

A       1.50     2.75      .75     3.50
B       1.25     1.43     2.08     2.25
C        .75      .53     1.02      .25
D       2.00      .80     1.80     1.00

TABLE OF (F − $F_e$)
F = actual frequency
Note that in view of later squaring the signs are all written as positive.

The next table gives $\frac{(F - F_e)^2}{F_e}$, that is, the numbers in the last
table were squared and divided by their respective $F_e$'s.

          D        C        B        A

A        .90     2.75      .25     4.90
B        .48      .57     1.48     1.56
C        .25      .12      .50      .03
D       2.00      .29     1.80      .50

TABLE OF $\frac{(F - F_e)^2}{F_e}$

The sum of all the $\frac{(F - F_e)^2}{F_e}$ numbers (i.e. chi-squared) = 18.38.

$$\chi^2 = \sum \frac{(F - F_e)^2}{F_e}$$

On consulting Fisher's or Elderton's tables the value of P, the
probability for $\chi^2$ = 18.38 and 9 degrees of freedom,¹ is approximately
.03. Thus the chances are about 3 in 100 that the deviations of the actual
from the expected frequencies could be through chance errors of
sampling. Accordingly, we have grounds for believing that there
is a contingency or relationship between the variables.


The Coefficient of Mean Square Contingency
The coefficient of mean square contingency is given by

$$C = \sqrt{\frac{\chi^2}{N + \chi^2}}$$

1 See below.

In the example we have worked out

$$C = \sqrt{\frac{18.38}{40 + 18.38}} = .56$$

Contingency is a better measure of divergence than association 
and should be regarded as such. Nevertheless, if the number of 
cells used were increased and a finer grouping obtained, C would 


approach in value to that of the correlation only if the distribu- 
tions of both sets of measures were normal or nearly normal. 
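The whole contingency calculation can be retraced from the table of observed frequencies. The sketch below uses hypothetical function names. It takes the Physics and Chemistry table, with the last Chemistry row (4, 3, 0, 1) fixed by the marginal totals, forms each expected frequency as row total times column total over the grand total, and then sums $(F - F_e)^2 / F_e$ over the sixteen cells before forming C.

```python
import math

# Rows: Chemistry A, B, C, D. Columns: Physics D, C, B, A.
observed = [
    [1, 0, 3, 6],
    [2, 5, 5, 1],
    [3, 3, 1, 2],
    [4, 3, 0, 1],
]


def expected_frequencies(table):
    """Expected cell frequencies if rows and columns were unrelated."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def chi_squared(table):
    """Sum of (F - Fe)^2 / Fe over every cell."""
    expected = expected_frequencies(table)
    return sum((f - fe) ** 2 / fe
               for row, erow in zip(table, expected)
               for f, fe in zip(row, erow))


def contingency_coefficient(table):
    """Coefficient of mean square contingency, sqrt(chi2 / (N + chi2))."""
    x2 = chi_squared(table)
    n = sum(sum(row) for row in table)
    return math.sqrt(x2 / (n + x2))


print(expected_frequencies(observed)[0])          # first row of the table of Fe's
print(round(chi_squared(observed), 2))
print(round(contingency_coefficient(observed), 2))
```

Working from the unrounded expected frequencies gives a $\chi^2$ of about 18.38 and a C of about .56. Totals built up by hand from rounded cells may differ a little from these figures.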


A Note on Degrees of Freedom 


Chi-squared tables give the value of the probability P in terms 
of x? and the number of degrees of freedom. This number is not 
usually equal to the number of cells in the contingency table or 
the number of cases, but is usually one less. Nevertheless, as 
R. A. Fisher has shown, the number of degrees of freedom, when 
the marginal totals remain the same sample after sample, will be 
(c − 1)(r − 1) where c is the number of columns and r is the
number of rows. We have to ask ourselves how many cells could
be filled in from prior knowledge and subtract this from the total 
number of cells in order to obtain the number of degrees of 
freedom; e.g. if we have a 4 x 4 table and can assume that the 
marginal totals remain fixed we should be able to compute the 
fourth row or column in each case knowing the three others. 


The number of degrees of freedom is therefore
(4 − 1)(4 − 1) = 9.
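The argument can be made concrete by filling in a table from its free cells. In the sketch below the function names are hypothetical. Only the nine cells of the first three rows and columns of the Physics and Chemistry table are supplied. The fourth entry of each row and the whole of the fourth row then follow from the fixed marginal totals.

```python
def degrees_of_freedom(rows, cols):
    """Free cells in a rows x cols table whose marginal totals are fixed."""
    return (rows - 1) * (cols - 1)


def complete_table(free_cells, row_totals, col_totals):
    """Rebuild the full table from its (r - 1) x (c - 1) free cells and the margins."""
    # Last entry of each of the first r - 1 rows comes from the row total.
    body = [list(row) + [total - sum(row)]
            for row, total in zip(free_cells, row_totals)]
    # Every entry of the last row comes from a column total.
    last_row = [total - sum(col) for total, col in zip(col_totals, zip(*body))]
    return body + [last_row]


free_cells = [[1, 0, 3],
              [2, 5, 5],
              [3, 3, 1]]
row_totals = [10, 13, 9, 8]      # Chemistry A, B, C, D
col_totals = [10, 11, 9, 10]     # Physics D, C, B, A

print(degrees_of_freedom(4, 4))
for row in complete_table(free_cells, row_totals, col_totals):
    print(row)
```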


CHAPTER IX 


THE ANALYSIS OF VARIANCE 


Standard deviation has proved so useful as a measure of
dispersion, as a step to correlation, factor analysis and the
use of the normal curve that the more recent and often
more useful technique of the analysis of variance has tended to be
overlooked. It is possible that the influence of Spearman, who
made such great use of the correlation coefficient in his technique of
factor analysis, did something to hinder the development of the
more widespread use of the analysis of variance.¹

Variance may be regarded as the square of the standard deviation:

$$V = \sigma^2 = \frac{\sum d^2}{N}$$

where N is the number of measures and d is the deviation of a
measure from the mean of all the measures.

(If the measures have been standardized by arranging them as
deviations from their mean and dividing them by the standard
deviation the S.D. is therefore the unit of measurement, i.e.
S.D. = 1 and V = 1.)

1 As has already been noted the psychologist of a generation ago borrowed something
of the terminology and technique of the Galton-Pearson school of biometricians.
In recent times the work of Professor R. A. Fisher, formerly of the
Rothamsted Experimental Station, in statistics chiefly concerned with agriculture
and other biological investigations has been adapted to psychological needs,
particularly by Sir Cyril Burt in this country. The most valuable aspects of Fisher's work
for our purposes are (a) his methods of designing experiments so that the results
shall be susceptible to simple statistical treatment, (b) the analysis of variance.
Details of his methods (with particular reference to agriculture) are to be found in
Statistical Methods for Research Workers and Design of Experiments. Burt's expositions
have a simplicity and clarity not always to be found in these treatises.

If we regard the mean as the first moment about the point from
which the mean is measured, the variance of a distribution may
be defined as the second moment about the mean:

$$\sigma^2 = \frac{1}{N} \sum (x - \bar{x})^2$$

where x is a score or measure
and $\bar{x}$ is the mean of the whole distribution.

Variance as a measure of variability has an advantage because 
it is additive, that is, the total variance of a set of measurements 
may be regarded as the sum of the independent parts or ‘factors’ 
which combine to make up the variance.¹

$\sigma_x^2 = \sigma_a^2 + \sigma_b^2 + \sigma_c^2 + \ldots$ etc.
if x = a + b + c.
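The additive property holds only when the parts are independent (uncorrelated), and a small artificial case shows both sides of this. In the sketch below the numbers are invented. Every value of a is paired with every value of b, so that the two parts are exactly uncorrelated. For contrast, a second total is built from two perfectly correlated parts.

```python
from itertools import product
from statistics import pvariance

# Every value of a combined with every value of b: exactly uncorrelated parts.
pairs = list(product([-1, 0, 1], [-2, 2]))
a = [p[0] for p in pairs]
b = [p[1] for p in pairs]
x = [u + v for u, v in pairs]                 # x = a + b

variance_of_total = pvariance(x)
sum_of_variances = pvariance(a) + pvariance(b)
print(round(variance_of_total, 4), round(sum_of_variances, 4))

# Perfectly correlated parts: x = a + a. The variances no longer simply add.
doubled = [u + u for u in a]
print(round(pvariance(doubled), 4), round(2 * pvariance(a), 4))
```

In the first case the variance of the total equals the sum of the two variances. In the second it is twice as large as that sum, because the parts vary together.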

In the analysis of variance the process is reversed and the total 
variance is broken down into those of the several components. 
One of these variances will obviously be due to error in measure- 
ment and usually will be taken to consist of random errors due to 
the smallness of the size of the sample which has been used for the 
investigation. The most frequent and useful application of the analysis 
of variance is to compare the significance of the variance due to some 
particular factor with the amount of variance due to error. 

(It will be recalled that in factor analysis the factors have to be 
discovered in the process of the analysis and their relative amounts 
estimated. In the analysis of variance the possible factors are 
assumed by reference to the given data and the problem is to 
establish their relative significance, that is, to find what is the 
probability that the variance due to each factor is to be accounted 
for as an effect of pure chance. In factor analysis we try to 
determine the relative importance of the inferred factors.) 

Let us consider a set of marks (x) which have been correlated 
with another set (y). Were all the individuals in the x column to 
have the same value there would still remain some scatter in the 
y column, that is, when x is constant there is yet some variability 
in the y scores. When there is correlation between the x and y 
values this variability, expressed as a ratio of the whole variance 
of y, is $\sigma_c^2 / \sigma_y^2$, where $\sigma_c^2$ denotes the variance of y 
when x is constant. As this is the proportion of the variance of y 
remaining when x is constant it may be considered the proportion 
of the variance in y attributable to factors in y other than x. 
Conversely, the reduction in variance when x is kept constant is 
the part of the variance due to the x factor. 

1 See page 54. 


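In terms of the entire variance of y the ratio is 

$$\frac{\sigma_y^2 - \sigma_c^2}{\sigma_y^2}$$

But 

$$\frac{\sigma_c^2}{\sigma_y^2} = 1 - r^2, \quad \text{therefore} \quad r^2 = 1 - \frac{\sigma_c^2}{\sigma_y^2} = \frac{\sigma_y^2 - \sigma_c^2}{\sigma_y^2}$$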


Accordingly the total variance may be divided into two parts 
of which the proportion due to what is common to x and y is equal 
to $r^2$, and the proportion due to the other factors is $1 - r^2$. 

$r^2$ is known as the coefficient of determination. 
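
The relation between the correlation and the variance remaining when x is held constant can be verified on a small example. The sketch below assumes five hypothetical pairs of marks and takes the variance about the straight regression line as the variance of y when x is constant; neither the data nor the variable names come from the text.

```python
# Hypothetical paired marks, chosen only for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n
sxx = sum((xi - mean_x) ** 2 for xi in x)
syy = sum((yi - mean_y) ** 2 for yi in y)
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))

r = sxy / (sxx * syy) ** 0.5          # product-moment correlation
slope = sxy / sxx                     # regression of y on x

# Deviations of y from the regression line: what is left of y
# once the part predictable from x has been removed.
residuals = [(yi - mean_y) - slope * (xi - mean_x) for xi, yi in zip(x, y)]

var_y = syy / n                                   # whole variance of y
var_c = sum(e ** 2 for e in residuals) / n        # variance with x constant

print(round(r ** 2, 4), round(1 - var_c / var_y, 4))
```

The two printed figures agree, which is the identity stated above: the coefficient of determination is the proportion of the variance of y removed when x is kept constant.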

[The above is true when correlation is linear and the line of 
regression is straight. Nevertheless, a similar relationship exists 
when the correlation is not linear and the correlation ratio $\eta$ 
(eta) is used. In this case, the variance of y is separable into two 
parts: the proportion due to x is $\frac{\sigma_y^2 - \sigma_c^2}{\sigma_y^2} = \eta^2$ and 
that due to the other factors is $\frac{\sigma_c^2}{\sigma_y^2} = 1 - \eta^2$.] 

In the analysis of variance the easiest way is to consider the 
average for each class implied by the factor. For example, we 
might require to find whether on the average males or females 
are more intelligent. All we have to do is to find the respective 
means of intelligence-test scores and to determine whether the 
difference between the two means can be attributed to the effects 
of random sampling. Here the classification is dichotomous but 
if we have to consider, in addition to sex, differences arising from 
race or school, we should have multiple classification and should 
have to compare a number of means all derived from the same 
principle of classification. 

Thus, it is useful in the case of the simple sex classification to 
find the standard error of the difference between the two averages, 
for this will tell us whether the difference is significant or 
attributable to chance errors of sampling. 

S.E. of a difference of means: 

$$\sigma_d = \sqrt{\frac{\sigma_1^2}{N_1} + \frac{\sigma_2^2}{N_2}}$$

$N_1$ and $N_2$ are the numbers in each of the two sets respectively 
and $\sigma_1$ and $\sigma_2$ are their standard deviations. 

$$\text{P.E.} = 0.6745 \sqrt{\frac{\sigma_1^2}{N_1} + \frac{\sigma_2^2}{N_2}}$$

The S.E. divided into the difference between the averages 
should give a quotient of at least 3, though if it were above 2 it 
might be worth while continuing the investigation. 


Problem 
A test has been applied to five arts students and five science 
students. The marks obtained are given below. The average for 
the arts students is 3 marks more than that of the science students. 
With this small sample is this difference likely to be a matter of 
chance or is it safe to assume that arts students are better on the 
average? 
          Arts Students                      Science Students 
Name      Marks  Deviation  Square    Name      Marks  Deviation  Square 

Cowper     21       +1         1      Maxwell    19       +2         4 
Shaw       19       −1         1      Faraday    14       −3         9 
Scott      18       −2         4      Darwin     18       +1         1 
Stewart    23       +3         9      Dale       15       −2         4 
Lamb       19       −1         1      Newton     19       +2         4 

Totals    100        0        16                 85        0        22 
Mean      100 ÷ 5 = 20                           85 ÷ 5 = 17 

Average of means = (20 + 17) ÷ 2 = 18.5 

Deviation of mean   +1.5                         −1.5 

To obtain the standard deviation we divide not by the number 
of each set of cases but by the number of degrees of freedom. This is 
an important conception in statistical analysis. In each column 
there are 5 deviations from a mean calculated from the given 
data. But the total of all the 5 deviations must be zero, and thus 
if we know 4 deviations we can at once calculate the 5th. Accor- 
dingly there are 4 degrees of freedom, i.e. only 4 deviations are 
independent. 

Thus the standard deviation of the individuals in the sample is 

$$\sigma = \sqrt{\frac{16 + 22}{4 + 4}} = \sqrt{\frac{38}{8}} = \sqrt{4.75} = 2.179$$

and the standard deviation of the difference is 

$$\sigma_d = \sigma \sqrt{\frac{1}{N_1} + \frac{1}{N_2}} = 2.179 \sqrt{\frac{1}{5} + \frac{1}{5}} = 1.3784$$

The critical ratio t is given by 

$$t = \frac{\text{mean}_1 - \text{mean}_2}{\sigma_d} = \frac{20 - 17}{1.3784} = \frac{3}{1.3784} = 2.176$$

On consulting Yule and Kendall's “t-table” we find that for 
8 degrees of freedom the probability of obtaining a difference as 
large as this by chance is P = 2(1 − .97) = .06, or 6 in 100; that is, 
the odds against getting a difference as large as this by chance 
are about 15 to 1. The difference cannot therefore be accepted 
as really significant. 
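
The arithmetic of this test can be followed step by step in a short program. The sketch below uses the marks of the problem; the function and variable names are our own, and the probability itself must still be read from a t-table.

```python
from math import sqrt

arts = [21, 19, 18, 23, 19]
science = [19, 14, 18, 15, 19]


def mean(marks):
    return sum(marks) / len(marks)


def sum_of_squares(marks):
    """Sum of squared deviations about the group's own mean."""
    m = mean(marks)
    return sum((v - m) ** 2 for v in marks)


# Degrees of freedom: (5 - 1) + (5 - 1) = 8
df = (len(arts) - 1) + (len(science) - 1)

# Pooled variance of the individuals: (16 + 22) / 8
pooled_variance = (sum_of_squares(arts) + sum_of_squares(science)) / df
sd = sqrt(pooled_variance)

# Standard deviation (standard error) of the difference of means
se_difference = sd * sqrt(1 / len(arts) + 1 / len(science))

# Critical ratio t
t = (mean(arts) - mean(science)) / se_difference

print(round(pooled_variance, 2), round(sd, 3), round(se_difference, 3), round(t, 3))
```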
Instead of comparing the difference between the means with 
a standard deviation derived from the individual measurements 
we can compare the variance of the means with a variance based 
on the original measurements. 


Firstly, let us reduce all the given marks to deviations about the 
general mean. This is 

$$\frac{100 + 85}{10} = 18.5$$

Then deviation of Arts Students' mean from General Mean    = +1.5 
     deviation of Science Students' mean from General Mean = −1.5 

Now split the marks for each student into three components: 
(1) the general average; (2) the deviation of his group mean from 
the general average; (3) his individual deviation above or below 
the sum of the two means. 


Thus Cowper’s mark is 21 = 18.5 + 1.5 + 1.0. 


MARKS ANALYSED IN DEVIATIONS OF MEANS AND INDIVIDUALS 

  Deviations of        Deviations of        Total Deviation from 
      Means             Individuals            General Mean 
   1a       1b         2a       2b            3a        3b 
  Arts   Science      Arts   Science         Arts    Science 
  +1.5    −1.5        +1.0    +2.0           +2.5     +0.5 
  +1.5    −1.5        −1.0    −3.0           +0.5     −4.5 
  +1.5    −1.5        −2.0    +1.0           −0.5     −0.5 
  +1.5    −1.5        +3.0    −2.0           +4.5     −3.5 
  +1.5    −1.5        −1.0    +2.0           +0.5     +0.5 

SQUARES OF THE ABOVE 
  2.25    2.25        1.00    4.00           6.25     0.25 
  2.25    2.25        1.00    9.00           0.25    20.25 
  2.25    2.25        4.00    1.00           0.25     0.25 
  2.25    2.25        9.00    4.00          20.25    12.25 
  2.25    2.25        1.00    4.00           0.25     0.25 
 11.25   11.25       16.00   22.00          27.25    33.25 

Total  22.50             38.00                  60.50 

CALCULATION OF MEAN SQUARES 

                       Degrees of     Sums of     Mean 
Source of Variation    Freedom        Squares     Square 
Between Groups          2 − 1 = 1      22.50      22.50 
Within Groups          10 − 2 = 8      38.00       4.75 
Total                  10 − 1 = 9      60.50      (6.72) 

VARIANCE-RATIOS, OBSERVED AND EXPECTED 

                            Degrees of 
Observed                    Freedom       Expected 
22.50 ÷ 4.75 = 4.737        1 and 8         5.32 

The deviation of the mean and the deviation of the individual 
are given in columns 1a, 1b and 2a, 2b respectively. It will be 
seen that these add up to the deviation about the general mean 
given in columns 3a and 3b. Further, in the table of squares it will 
be seen that the totals of the squares of mean and of individual 
deviations add up to the total of the squares of the deviation from 
the general mean. 

To obtain the ‘mean-squares’ or ‘variances’ we divide each of 
the three square sums by the corresponding degrees of freedom. 
There are 2 deviations for the 2 means, but as these are calculated 
from the general mean of the data one degree of freedom has been 
lost. There are 5 deviations about the mean for arts students and 
5 about the mean for science students, and each set of these is 
calculated from the mean of its group. Hence the number of 
degrees of freedom is (5 − 1 + 5 − 1) = (10 − 2) = 8. As there 
are 10 individual deviations about the general mean these give 
(10 − 1) = 9 degrees of freedom. 

In the table showing the variance or mean square note that the 
column of degrees of freedom adds up to the degrees of freedom 
of the whole group, and the square sums for the two components 
add up to the square sum of the entire group and this provides 
a useful check. 

As we analyse the total sum of squares and not the total 
variance, the variances do not add up to the total variance. We 
now proceed to test the variance between the means of the two 
groups. (If the variance to be tested is due solely to error, then 
it should be equal to the error-variance. Hence to test the former 
we divide by the latter.) The variance of the individuals within 
the group, taken from the mean of either group, is treated as 
denoting the error variance. The probabilities corresponding to 
various values of the variance ratio F can be found in Fisher’s or 
Snedecor’s tables, and as before a 5% probability may be taken 
as marking the borderline for significance. The observed ratio is 
4.737 in this case, which is less than the borderline value of 5.32. 
Again, by this method we conclude that the difference between 
the two means cannot be regarded as fully significant. 

In the case under consideration $F = t^2$ (and we note that 
$\sqrt{F} = \sqrt{4.737} = 2.176$, which was the value previously obtained 
for t). 
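
The same result may be reached by computing the two mean squares directly. The following sketch, with function names of our own choosing, forms the between-groups and within-groups mean squares for the arts and science marks and shows that the square root of the variance ratio reproduces the critical ratio found earlier.

```python
from math import sqrt


def one_way_f(groups):
    """Return (between mean square, within mean square, F) for a list of groups."""
    all_marks = [v for g in groups for v in g]
    general_mean = sum(all_marks) / len(all_marks)
    group_means = [sum(g) / len(g) for g in groups]

    # Each group mean counts once for every individual in the group.
    ss_between = sum(len(g) * (m - general_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Individual deviations are taken from the individual's own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)

    ms_between = ss_between / (len(groups) - 1)
    ms_within = ss_within / (len(all_marks) - len(groups))
    return ms_between, ms_within, ms_between / ms_within


arts = [21, 19, 18, 23, 19]
science = [19, 14, 18, 15, 19]

ms_between, ms_within, f_ratio = one_way_f([arts, science])
t_from_f = sqrt(f_ratio)

print(ms_between, ms_within, round(f_ratio, 3), round(t_from_f, 3))
```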


Testing the Significance of the Differences between Several Means1 

Where the criterion of classification gives two classes only it is 
adequate to test the difference between the two means by the 
standard error of the difference, that is, by the t-ratio. When we 
have three or more classes it is necessary to use methods involving 
the variance or F-ratio. Suppose that instead of considering the 
abilities of students in only two faculties of a university, we have 
to make a comparison of students in all the faculties. Suppose, 
for simplicity, we consider three faculties only and that the test 
results are as follows: 


MARKS FOR ARTS, SCIENCE AND MEDICAL STUDENTS 

              Arts     Science           Medicine 
              Mark      Mark      Mark     Dev.    Square 
               21        19        18       +2        4 
               19        14        16        0        0 
               18        18        15       −1        1 
               23        15        17       +1        1 
               19        19        14       −2        4 
Total         100        85        80                10 
Average        20        17        16 
Deviation    +2.33     −0.67     −1.67 
Square        5.44      0.44      2.78 

(The last two rows give the deviation of each average from the 
general mean, 17.67, and its square.) 

It is unnecessary to repeat the deviations and squares for arts and 
science students. It is also unnecessary to repeat the means, etc., 
for every person tested. We have simply to multiply the square 
of the deviation of each mean by 5 (the number of individuals) 
and then take the sum; or more simply to sum the squares first 
(5.44 + 0.44 + 2.78 = 8.66) and then multiply the sum by 5. We 
obtain 5 × 8.66 = 43.3. 

The sums of the squares of the individual deviations within each 
of the three groups (calculating from the corresponding group 
mean) are 16 + 22 + 10 = 48. The square-sums for the 15 
deviations from the general mean (17.67) need not be calculated, 
except as a check. 

1 I am indebted to Sir Cyril Burt for the treatment of this problem and for the 
subsequent account, taken from his laboratory notes, of his adaptation of Fisher's 
methods. 

Tabulating the results as before, we obtain the mean squares 
as follows: 


CALCULATION OF MEAN SQUARES 

Source of          Degrees of        Sum of      Mean 
Variation          Freedom           Squares     Square 
Between Groups      3 − 1 = 2         43.3       21.67 
Within Groups      15 − 3 = 12        48.0        4.00 
Total              15 − 1 = 14        91.3 


VARIANCE RATIOS, OBSERVED AND EXPECTED 

Observed                    Degrees of Freedom     Expected 
21.67 ÷ 4.00 = 5.42              2 and 12            3.88 


The ratio of the two variances is now 5-42, well above the value 
we should expect with 2 and 12 degrees of freedom. Thus there 
can now be little doubt that the difference of faculty does after 
all tend to produce slight but genuine differences in the average 
marks obtained by the test. 
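
For three or more groups the working is the same, only longer. The sketch below builds the table of sums of squares for the three faculties from the marks given above; the function name and the layout of the returned table are our own choices, and the expected value of F must still be taken from the published tables.

```python
def anova_table(groups):
    """One-way analysis: (degrees of freedom, sum of squares[, mean square]) per source."""
    all_marks = [v for g in groups for v in g]
    n, k = len(all_marks), len(groups)
    general_mean = sum(all_marks) / n

    ss_between = sum(len(g) * (sum(g) / len(g) - general_mean) ** 2 for g in groups)
    ss_within = sum((v - sum(g) / len(g)) ** 2 for g in groups for v in g)
    ss_total = sum((v - general_mean) ** 2 for v in all_marks)   # used as a check

    return {
        "between": (k - 1, ss_between, ss_between / (k - 1)),
        "within": (n - k, ss_within, ss_within / (n - k)),
        "total": (n - 1, ss_total),
    }


arts = [21, 19, 18, 23, 19]
science = [19, 14, 18, 15, 19]
medicine = [18, 16, 15, 17, 14]

table = anova_table([arts, science, medicine])
f_ratio = table["between"][2] / table["within"][2]

for source, row in table.items():
    print(source, [round(x, 2) for x in row])
print("F =", round(f_ratio, 2))
```

The between-groups and within-groups sums of squares add up to the total, which provides the same check as in the worked table.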

For purposes of illustration we have taken tiny samples with 
5 individuals in each. But the numbers in each sample need not 
be the same, and indeed may be so large that the sums of squares 
are best calculated from grouped frequencies. With continuous 
variates it is then better not to use Sheppard's correction but to 
keep the grouping fine. 

The method may be conveniently used to test the significance 
of the correlation ratio. Treating the groups as ‘arrays’ in a 
correlation-table, we have 


$$\eta^2 = \frac{\text{Sum of Squares between Groups}}{\text{Total Sum of Squares}} = \frac{43.3}{91.3} = .474$$


Hence $\eta$ = .689 (by consulting Yule and Kendall table, p. 454). 


Testing the Significance of a Difference between two Means 
Problem: Consider the marks allotted to the four pupils as 
follows: 

              Tom    Dick    Harry    George    Total    Average 
Arithmetic     29     24      14         1        68        17 
English        29     28      15         4        76        19 
Drawing        32     27      27        22       108        27 
Handwork       34     29      28        25       116        29 
Total         124    108      84        52       368        92 
Average        31     27      21        13        92        23 


Take, to begin with, two pupils only. The average mark allotted 
to Tom is 31, to Dick 27. Can we safely infer from this that Tom's 
general ability is significantly greater than Dick's, or (since we 
have used only 4 tests) is it more likely that the difference results 
solely from chance? 


1st Method: Standard Error of the Mean Difference 
As before, the most obvious procedure is to calculate the 
standard error of the difference by the usual formula. 


CALCULATION OF STANDARD ERROR OF DIFFERENCE 

                1       2       3        4        5 
Test           Tom     Dick    Diff.    Dev.    Squares 
Arithmetic      29      24      +5       +1        1 
English         29      28      +1       −3        9 
Drawing         32      27      +5       +1        1 
Handwork        34      29      +5       +1        1 
Total          124     108     +16        0       12 
Average         31      27      +4        0 


Since Tom's and Dick's marks may be correlated, it is simpler to 
calculate the detailed differences instead of the S.D.s of the marks 
observed and their correlation. The calculation is shown in the 
first 3 columns of the last table. 

The deviations of the differences about the mean difference 
(+4) are given in column 4. As usual, to find their standard 
deviation we add the squares of the deviations (column 5), but 
we divide by the number of degrees of freedom. When we started 
there were n = 4 items, and therefore 4 “degrees of freedom” (i.e. 
4 figures that vary independently). But in taking deviations about 
a mean calculated from the observed data, we have lost one degree 
of freedom: for, when we know the first 3 deviations (or any 3 
deviations), we can fill in the 4th from the fact that the total 
must be 0. 

Hence to find the ‘mean square’ we divide, not by 4 but by 3. 
This mean square (12 ÷ 3 = 4) is the ‘variance’ of the individual 
differences: and its square root (2) would be their standard 
deviation. 

But we require the standard deviation of the mean difference. 
To obtain the variance of a mean, we divide the variance of the 
individuals by the number of individuals. We then obtain 
4 ÷ 4 = 1. The square root of this gives the standard deviation. 
In the absence of any other information we must take the standard 
deviation of the mean difference thus calculated, as the best 
indication of the standard error of the mean difference. 
Accordingly, to test the significance of the mean difference (m) we 
divide it by its standard deviation. Using the t-ratio as before, 
we obtain 

$$t = \frac{m}{\sigma_m} = \frac{4}{1} = 4$$

(where $\sigma_m = \sqrt{\Sigma x^2 \div n(n - 1)}$, x being the deviation of a 
difference from the mean difference). 

From the t-table given by Yule and Kendall (p. 536) we find 
that, with 3 degrees of freedom, a value of t = 4 corresponds to a 
tabled value of .986. Thus, the chance of getting a difference so 
large as this (in either direction) would be 
P = 2(1 − .986) = .028, or 35 to 1 against. 
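
The first method reduces to a few lines of arithmetic on the differences. The sketch below follows the steps just described for Tom and Dick; the variable names are our own, and the probability is not computed but left to the t-table.

```python
from math import sqrt

tom = [29, 29, 32, 34]    # Arithmetic, English, Drawing, Handwork
dick = [24, 28, 27, 29]

# Work on the detailed differences, since the two sets of marks may be correlated.
differences = [a - b for a, b in zip(tom, dick)]
n = len(differences)
mean_difference = sum(differences) / n

# Sum of squared deviations of the differences about the mean difference.
square_sum = sum((d - mean_difference) ** 2 for d in differences)

# One degree of freedom is lost because the mean comes from the data.
variance_of_differences = square_sum / (n - 1)

# Variance of a mean = variance of individuals / number of individuals.
se_of_mean_difference = sqrt(variance_of_differences / n)

t = mean_difference / se_of_mean_difference

print(differences, mean_difference, square_sum, variance_of_differences, t)
```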

The method indicated above has certain limitations although 
it suffices for the actual problem which is given. We may desire 
to test the significance of differences not only between two pupils 
but between all the pupils in the class, but it would involve a great 
deal of work to compare every pair of pupils by the method given. 
Even if we did this the general picture would still not be clear, as 
it is impossible to draw the general inference from the pairs 
considered severally. We need a more comprehensive analysis of 
all the data which have been given. This is given by a general 
method of analysis of variance on the following lines: 

The 8 observed marks set out in columns 1 and 2 of the last table 
may be regarded as deviations of 8 performances about the average 
performance of both boys in all four tests (i.e. about the average 
mark of 29). The 8 deviations are given in columns 3 and 4 of the 
table which follows. Instead of measuring the total amount of 
deviation by the sum of the 8 deviations (which would be zero 
unless we ignore the signs) we can measure it by the sum of the 
squares of those deviations. The squares are given in columns 
7 and 8. 


CALCULATION OF TOTAL VARIANCE FOR TWO BOYS 

                                              Squares          Squares of 
                  Mean        Deviations      of Means         Deviations 
               Tom   Dick     Tom   Dick     Tom    Dick       Tom   Dick 
Test            1     2        3     4        5      6          7     8 
Arithmetic     29    29        0    −5       841    841         0    25 
English        29    29        0    −1       841    841         0     1 
Drawing        29    29       +3    −2       841    841         9     4 
Handwork       29    29       +5     0       841    841        25     0 

Total                            0          3364   3364        34    30 
                                               6728               64 
                                                       6792 

MEANS AND DEVIATIONS 

               General      Means of      Means of 
                Mean          Boys          Tests        Deviations      Totals 
              Tom  Dick     Tom  Dick     Tom    Dick    Tom    Dick    Tom  Dick 
Test           1a   1b       2a   2b       3a     3b      4a     4b      5a   5b 
Arithmetic     29   29       +2   −2      −2.5   −2.5    +0.5   −0.5     29   24 
English        29   29       +2   −2      −0.5   −0.5    −1.5   +1.5     29   28 
Drawing        29   29       +2   −2      +0.5   +0.5    +0.5   −0.5     32   27 
Handwork       29   29       +2   −2      +2.5   +2.5    +0.5   −0.5     34   29 


SQUARES OF ABOVE 

Test           1a    1b      2a   2b       3a     3b      4a     4b      5a    5b 
Arithmetic    841   841       4    4      6.25   6.25    0.25   0.25    841   576 
English       841   841       4    4      0.25   0.25    2.25   2.25    841   784 
Drawing       841   841       4    4      0.25   0.25    0.25   0.25   1024   729 
Handwork      841   841       4    4      6.25   6.25    0.25   0.25   1156   841 

Total        3364  3364      16   16     13.00  13.00    3.00   3.00   3862  2930 
                6728           32            26              6           6792 


Components 

Our task is now to analyse these gross deviations into their chief 
components. Each deviation may be regarded as the sum of 
3 deviations: (i) the mean deviation of the particular boy above 
or below the general mean (29); (ii) the mean deviation of the 
particular test above or below the general mean; (iii) the indivi- 
dual deviation of each mark from the sum of these two means. 
This subdivision is shown in the table of Means and Deviations. 
Observe that, in combination with the general mean, the three 
figures add up to the original marks, appended in the last two 
columns. 

We now square all these figures and enter them in the Table of 
Squares where they are analysed. We notice that the component 
sums at the bottom of the table add to the grand total (6792). 

We are not concerned with the squares of the general mean 
(6728). What interests us is the partition of the sum of the square 
of the unanalysed deviations (64) into the sum of the sums of 
the squares of the three components. We observe that 

64 = 32 + 26 + 6 
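
The partition of the square-sum into its three components can be reproduced mechanically. The following sketch, whose function and variable names are our own, takes the marks of Tom and Dick in the four tests and returns the sums of squares for boys, for tests and for the individual deviations, together with the total as a check.

```python
def two_way_sums_of_squares(marks):
    """marks[i][j] is the mark in test i (row) obtained by boy j (column)."""
    n_tests, n_boys = len(marks), len(marks[0])
    general_mean = sum(sum(row) for row in marks) / (n_tests * n_boys)
    test_means = [sum(row) / n_boys for row in marks]
    boy_means = [sum(row[j] for row in marks) / n_tests for j in range(n_boys)]

    # Each boy's mean deviation is repeated once for every test, and vice versa.
    ss_boys = n_tests * sum((m - general_mean) ** 2 for m in boy_means)
    ss_tests = n_boys * sum((m - general_mean) ** 2 for m in test_means)

    # Individual deviation of each mark from the sum of the two means.
    ss_error = sum(
        (marks[i][j] - boy_means[j] - test_means[i] + general_mean) ** 2
        for i in range(n_tests)
        for j in range(n_boys)
    )
    ss_total = sum((v - general_mean) ** 2 for row in marks for v in row)
    return ss_boys, ss_tests, ss_error, ss_total


# Rows: Arithmetic, English, Drawing, Handwork; columns: Tom, Dick.
marks = [[29, 24], [29, 28], [32, 27], [34, 29]]

ss_boys, ss_tests, ss_error, ss_total = two_way_sums_of_squares(marks)
print(ss_boys, ss_tests, ss_error, ss_total)
```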


The Variances 
We can now proceed to test the significance, not only of the 
variance due to the differing means of the 2 boys, but also of the 
variance due to the differing means of the 4 subjects. As before, 
what we shall test is not the differences between the means, but 
the total variance of the means. The sums of squares and the 
degrees of freedom by which we divide them are tabulated in the 
first two columns of the table. The result of the division is given 
in the last column. 


ANALYSIS OF VARIANCE: (TWO BOYS) 

Source of      Sum of      Degrees of      Mean 
Variation      Squares     Freedom         Square 
Boys             32        2 − 1 = 1        32 
Tests            26        4 − 1 = 3         8.67 
Error             6        8 − 5 = 3         2 

Total            64        8 − 1 = 7        (9.14) 


Degrees of Freedom 

Since the deviations of the 2 boys’ means and the deviations of 
the 4 test means are calculated about the general mean, we must 
deduct one degree of freedom from each. The same is true of the 
deviations of the 8 performances: but this we only need as a check. 
The boys’ variance and the test variance are the variances to be 
tested, and so form the numerator of the variance-ratio. And 
since a variance, not a difference, is being tested, we require for 
the denominator, not the standard error, but the error variance. 
The only part of the data that we can use to indicate the error 
variance will be the deviations of the 8 performances from the 
sum of the means, i.e. the deviations shown in columns 4a and 4b. 
There are 8 figures; but in calculating these figures from the 
original 8 marks we have already used 5 degrees of freedom (1 for 
the general mean; 2 − 1 = 1 for the boys’ means; and 4 − 1 = 3 
for the test means). Hence only 3 degrees of freedom are left. It 
is easy to see that, if we take any 3 figures in columns 4a and 4b, 
say +0.5, −1.5, +0.5, we can deduce the other 5, because we 
know that the sums of both columns and rows must all be zero. 


Significance Test 

To test significance, we now take the ratios of the variance of 
the boys’ means, and then of the test means, to the variance due 
to ‘error’. 


VARIANCE RATIOS (F), OBSERVED AND EXPECTED 

                                          Degrees of 
Source     Ratio Observed                 Freedom       Expected 
Boys       F_B = 32 ÷ 2 = 16              1 and 3         10.1 
Tests      F_T = 8.67 ÷ 2 = 4.3           3 and 3          9.3 


Thus the difference between the two boys is fully significant, but 
the differences between the tests (applied to only two pupils in 
this part of the inquiry) are not significant. 


Relation between the Two Alternative Methods 


Since, with 1 and 3 degrees of freedom, an F-ratio of 10.1 
represents P = 0.05, we might guess, by rough interpolation, that 
an F-ratio of 16 would represent P = 0.03 or thereabouts (the 
value obtained with the first method). In fact, we note as before 
that $F = t^2$, for F = 16 and t = 4. 
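
The agreement between the two methods can be confirmed from the figures of the analysis for two boys. The sketch below starts from the sums of squares (32, 26 and 6) and their degrees of freedom; the dictionary layout is merely a convenience of ours.

```python
from math import sqrt

# Sums of squares and degrees of freedom from the analysis for two boys.
sums_of_squares = {"boys": 32, "tests": 26, "error": 6}
degrees_of_freedom = {"boys": 1, "tests": 3, "error": 3}

mean_square = {
    source: sums_of_squares[source] / degrees_of_freedom[source]
    for source in sums_of_squares
}

# Each variance is tested against the error variance.
f_boys = mean_square["boys"] / mean_square["error"]
f_tests = mean_square["tests"] / mean_square["error"]

# With one degree of freedom in the numerator, the square root of F is t.
t_boys = sqrt(f_boys)

print(f_boys, round(f_tests, 2), t_boys)
```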


Testing Reliability 

There is no reason why the two columns of observed figures, 
like those set out in columns 1 and 2 above, should always 
represent persons, or the rows should always represent tests. For 
example, if we had applied two tests to four (or more) persons, 
then the headings ‘Tom’ and ‘Dick’ would be altered to ‘1st Test’, 
‘2nd Test’; and the side-titles would be the names of the persons 
tested instead of names of school subjects. This is the form the 
data take when we wish to test the reliability of two successive 
applications of the same test. The two means of the columns will 
now represent difficulty of tests, or possibly the improvement 
shown in the second test as a result of practice or familiarity with 
the first; and the means of the pair of marks in each row the 
average ability of the boys tested. Unless the averages for the 
boys differ significantly, the test is failing to differentiate between 
the several boys tested, and so is devoid of reliability. The usual 
measure of the amount of reliability is, of course, the correlation 
between the two columns. 


Testing the Significance of the Differences between SEVERAL Means 
Problem 


The advantages of the second procedure are most evident where 
we desire to test the significance of the differences between the 
means, not for two boys only, but for several — say four. As 
before we can at the same time test the significance of the differ- 
ences between the means for the four school subjects. Subtracting 


the general mean (23) from the figures in the table for four boys 
we have 

                     DEVIATIONS                            SQUARES OF DEVIATIONS 

Test         Tom  Dick  Harry  George | Total | Mean |  Tom  Dick  Harry  George | Total 
Arithmetic    +6   +1    −9     −22   |  −24  |  −6  |   36     1    81    484   |   602 
English       +6   +5    −8     −19   |  −16  |  −4  |   36    25    64    361   |   486 
Drawing       +9   +4    +4      −1   |  +16  |  +4  |   81    16    16      1   |   114 
Handwork     +11   +6    +5      +2   |  +24  |  +6  |  121    36    25      4   |   186 

Total        +32  +16    −8     −40   |    0  |   0  |  274    78   186    850   |  1388 

Mean          +8   +4    −2     −10   |    0  |   0 


Components 

We now analyse these deviations into the same three components 
as before, namely (i) the mean deviation of each boy; 
(ii) the mean deviation of each test; (iii) the deviation of each of 
the 16 performances from the sum of the two means. These are 
shown in the first table below. The reader should check the fact 
that for each performance the three components add up to the 
deviation shown above. 


The squares of these deviations follow: 


ANALYSIS OF DEVIATIONS                      SQUARES 

              Tom  Dick  Harry  George      Tom  Dick  Harry  George | Total 
(1) Deviations for Boys                     (1) Squares of Deviations 
Arithmetic     +8   +4    −2     −10         64    16     4     100  |  184 
English        +8   +4    −2     −10         64    16     4     100  |  184 
Drawing        +8   +4    −2     −10         64    16     4     100  |  184 
Handwork       +8   +4    −2     −10         64    16     4     100  |  184 
Square Sum                                  256    64    16     400  |  736 

(2) Deviations for Tests                    (2) Squares of Deviations 
Arithmetic     −6   −6    −6      −6         36    36    36      36  |  144 
English        −4   −4    −4      −4         16    16    16      16  |   64 
Drawing        +4   +4    +4      +4         16    16    16      16  |   64 
Handwork       +6   +6    +6      +6         36    36    36      36  |  144 
Square Sum                                  104   104   104     104  |  416 

(3) Deviations for Performances             (3) Squares of Deviations 
Arithmetic     +4   +3    −1      −6         16     9     1      36  |   62 
English        +2   +5    −2      −5          4    25     4      25  |   58 
Drawing        −3   −4    +2      +5          9    16     4      25  |   54 
Handwork       −3   −4    +1      +6          9    16     1      36  |   62 
Square Sum                                   38    66    10     122  |  236 


Error 

Provisionally we shall treat the four tests as random (and 
therefore uncorrelated) specimens of tests for “general ability”: 
that would mean that we can take the last set of deviations (the 
residuals) as due to ‘error’. Strictly this assumption should be 
tested first of all: and in fact we shall presently see that it is not 
tenable. But for the present we are concerned only to illustrate 
the procedure for simple cases first. 
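
The three sets of deviations for the four boys, with the last set provisionally taken as error, can be generated directly from the marks. The sketch below is one way of doing so; the variable names are our own, and it reproduces the square-sums 736, 416 and 236.

```python
# Rows: Arithmetic, English, Drawing, Handwork.
# Columns: Tom, Dick, Harry, George.
marks = [
    [29, 24, 14, 1],
    [29, 28, 15, 4],
    [32, 27, 27, 22],
    [34, 29, 28, 25],
]
n_tests, n_boys = len(marks), len(marks[0])
general_mean = sum(sum(row) for row in marks) / (n_tests * n_boys)

# (1) Mean deviation of each boy and (2) of each test from the general mean.
boy_dev = [sum(row[j] for row in marks) / n_tests - general_mean
           for j in range(n_boys)]
test_dev = [sum(row) / n_boys - general_mean for row in marks]

# (3) Deviation of each performance from the sum of the two means.
residual = [
    [marks[i][j] - general_mean - boy_dev[j] - test_dev[i] for j in range(n_boys)]
    for i in range(n_tests)
]

ss_boys = n_tests * sum(d ** 2 for d in boy_dev)
ss_tests = n_boys * sum(d ** 2 for d in test_dev)
ss_residual = sum(v ** 2 for row in residual for v in row)
ss_total = sum((v - general_mean) ** 2 for row in marks for v in row)

print(boy_dev, test_dev)
print(ss_boys, ss_tests, ss_residual, ss_total)
```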


Degrees of Freedom 

The degrees of freedom are calculated as before. The easiest 
way to decide the degrees of freedom for the ‘error variance’ is to 
subtract from the total degrees (15) the degrees for the other two 
items (3 + 3 = 6): that is equivalent to subtracting from the 
total number (16) the number of constants used to calculate the 
deviations for error (1 + 3 + 3 = 7). 

We can now tabulate the calculations for the mean squares (or 
‘variances’) in the same way as before. 


ANALYSIS OF VARIANCE: (FOUR BOYS) 

Source of      Sum of      Degrees of       Mean 
Variation      Squares     Freedom          Square 
Boys             736        4 − 1 = 3       245.3 
Tests            416        4 − 1 = 3       138.7 
Residual         236       16 − 7 = 9        26.2 
Total           1388       16 − 1 = 15      (92.5) 


Significance Test 
The variance ratios are calculated as before. 


VARIANCE RATIOS (F), OBSERVED AND EXPECTED 

                                             Degrees of 
Source     Ratio Observed                    Freedom       Expected 
Boys       F_B = 245.3 ÷ 26.2 = 9.36         3 and 9         8.8 
Tests      F_T = 138.7 ÷ 26.2 = 5.29         3 and 9         8.8 


The degrees of freedom are now larger than before because we 
have taken 4 boys instead of only 2. And once again the differ- 
ences between the 4 boys appear to be fully significant, but (with 
error assessed as above) the differences between the 4 tests are not 
significant. 


Testing Reliability 

Suppose that Tom, Dick, Harry and George are the names of 
four examiners marking test performances by four boys in the 
same subject. Thus the names of the rows down the left-hand 
margin of the table are names of candidates taking the tests. We 
can now use the analysis of variance to measure the reliability or 
self-consistency of the whole examination. We could vary this by 
making the headings of the columns four component tests instead 
of four different examiners. The reliability coefficient is given by 

$$r = \frac{P - E}{P}$$

where P is the mean square for pupils or candidates and E is the 
mean square for error based on the residuals.1 
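
The formula is simple enough to apply at sight, but a short sketch may make the roles of P and E plain. It assumes, purely for illustration, that the table for four boys is re-read in the way just described, so that the rows are candidates and the columns examiners; P is then the mean square for rows (416 ÷ 3) and E the mean square for the residuals (236 ÷ 9). The function name is our own.

```python
def reliability(p_mean_square, e_mean_square):
    """Reliability coefficient (P - E) / P from two mean squares."""
    return (p_mean_square - e_mean_square) / p_mean_square


# Assumed reading of the earlier table: rows are candidates, columns examiners.
P = 416 / 3    # mean square for candidates (rows)
E = 236 / 9    # mean square for error (residuals)

r = reliability(P, E)
print(round(r, 3))
```

When the candidates' mean square is no larger than the error mean square the coefficient falls to zero or below, which corresponds to an examination that does not differentiate between the candidates at all.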


Testing the Significance of Group Factors (Interaction) 
Problem 

The foregoing are the simplest and commonest types of case in 
which the analysis of variance can be applied. We now proceed 
to introduce a further complication. 

In estimating the variance for error, we assumed that the 
deviations of the 16 performances from the combined means of 
boy and test were random deviations. A glance at the figures 
headed ‘deviations for performances’ on page 161 is sufficient to 
show that they are not random, but correlated. We must therefore 
treat them as containing yet another component, a bipolar 
component. This is technically termed interaction, because the 
type of boy tested ‘interacts’ with the type of test used, i.e. an 
academic type of boy does well in the academic type of test, 
whether Arithmetic or English and, by comparison, badly in the 
practical type of test: conversely for the practical type of boy. 

1 This is developed by Burt in The British Journal of Educational Psychology, XV, 
pages 80-92. The use of factor analysis for a similar purpose is given in BURT, Marks 
of Examiners. 

This bipolar component we can assess by averaging the devia- 
tions in each column, reversing the signs of the last two to prevent 
the totals adding up to zero. We then calculate the deviations 
about these further averages. Thus the variance of the deviations 
for performances can itself be analysed along the same lines as 
before. 


              Tom  Dick  Harry  George      Tom   Dick  Harry  George | Total 
(4) Deviations for Bipolar Component        (4) Squares of Deviations 
Arithmetic     +3   +4   −1.5   −5.5          9    16    2.25   30.25 |  57.5 
English        +3   +4   −1.5   −5.5          9    16    2.25   30.25 |  57.5 
Drawing        −3   −4   +1.5   +5.5          9    16    2.25   30.25 |  57.5 
Handwork       −3   −4   +1.5   +5.5          9    16    2.25   30.25 |  57.5 
Square Sum                                   36    64    9.00  121.00 | 230.0 

(5) Deviations for Error                    (5) Squares of Deviations 
Arithmetic     +1   −1   +0.5   −0.5          1     1    0.25    0.25 |   2.5 
English        −1   +1   −0.5   +0.5          1     1    0.25    0.25 |   2.5 
Drawing         0    0   +0.5   −0.5          0     0    0.25    0.25 |   0.5 
Handwork        0    0   −0.5   +0.5          0     0    0.25    0.25 |   0.5 
Square Sum                                    2     2    1.00    1.00 |   6.0 



The degrees of freedom for the ‘bipolar component’ will evi- 
dently be 3; and those for the ‘deviations for error’ will evidently 
be 6. We have thus split what we previously assumed to be ‘error’ 
into two components. Note that both the square-sum and the 
degrees of freedom now obtained add up to those previously 
assigned to ‘error’ in the table of the analysis of variance for four 
boys. 
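
The splitting of the former ‘error’ into a bipolar component and a remainder may be sketched as follows. The residuals are the deviations for performances found earlier; the list of signs, which marks Arithmetic and English as the academic tests and Drawing and Handwork as the practical ones, and all variable names, are our own way of expressing the reversal of signs described above.

```python
# Deviations for performances (rows: Arithmetic, English, Drawing, Handwork;
# columns: Tom, Dick, Harry, George).
residuals = [
    [4, 3, -1, -6],
    [2, 5, -2, -5],
    [-3, -4, 2, 5],
    [-3, -4, 1, 6],
]
signs = [1, 1, -1, -1]    # + for the academic tests, - for the practical tests
n_tests, n_boys = len(residuals), len(residuals[0])

# Average each column after reversing the signs of the last two rows.
column_average = [
    sum(signs[i] * residuals[i][j] for i in range(n_tests)) / n_tests
    for j in range(n_boys)
]

# Bipolar component: the column average with the sign appropriate to the test.
bipolar = [[signs[i] * column_average[j] for j in range(n_boys)]
           for i in range(n_tests)]

# What remains after removing the bipolar component is now treated as error.
error = [[residuals[i][j] - bipolar[i][j] for j in range(n_boys)]
         for i in range(n_tests)]

ss_bipolar = sum(v ** 2 for row in bipolar for v in row)
ss_error = sum(v ** 2 for row in error for v in row)

print(column_average)
print(ss_bipolar, ss_error, ss_bipolar + ss_error)
```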

We must now analyse the total variance afresh. 

(In setting out tables like the following the beginner finds it best 
to set the obtained figure first, the degrees of freedom next, and 
the calculated or textbook figures last, since that is the order of 
working. The experienced worker, however, will put the degrees 
of freedom first, since they really indicate the structure and 
fundamental conditions of the analysis.) 

ANALYSIS OF VARIANCE: (WITH FOUR COMPONENTS) 

Source of       Sum of      Degrees of        Mean 
Variation       Squares     Freedom           Square 
Boys              736        4 − 1 = 3        245.3 
Tests             416        4 − 1 = 3        138.7 
Interaction       230        4 − 1 = 3         76.7 
Error               6       16 − 10 = 6         1.0 
Total            1388       16 − 1 = 15       (92.5) 


The observed and expected variance ratios may be tabulated as 
follows. The divisor is now 1.0 in every case. 


VARIANCE RATIOS 

                                      Degrees of         Expected 
Source         Ratio    Observed      Freedom          5%      1% 
Boys           F_B       245.3        3 and 6         4.76    9.78 
Tests          F_T       138.7        3 and 6         4.76    9.78 
Interaction    F_I        76.7        3 and 6         4.76    9.78 


Thus, when we allow for the fact that the tests are highly 
correlated, and thus confirm one another far more strongly than 
a random set of tests, the differences between boys, between tests, 
and between types of boy (or test) appear highly significant. 


Application to Factor Analysis 

It will now be seen that we have demonstrated the statistical 
significance of (i) the “general factor” of average ability, and (ii) 
the ‘group factor’ of academic versus practical ability. Thus, 
provided the factor-measurements are obtained by simple 
averaging, we have found a convenient method for testing the 
significance of factors. 

(The high significance thus obtained with a sample consisting 
of 4 boys only may seem surprising. But the correlations are 
equally high. Thus, the observed correlation for Arithmetic and 
English is .99 and the residual correlation .92. Now with 4 items 
the 1% level is .99 and the 5% level .90. But we have not one 
correlation but 6 in each case, though not all 6 residual 
correlations will be independent. Thus the rough test of significance 
applied to the correlations confirms the more precise test obtained 
by analysing variance. However, it should be remembered that 
the figures given in this example are purely artificial, chosen to 
simplify the mental arithmetic, rather than to illustrate the kind 
of figures actually obtained.) 


Interaction 

When planning a research which will involve the analysis of 
variance the 'factors' are chosen not so much because they 
operate independently but because they can be controlled and 
measured. Thus it is necessary to devise methods of research 
wherein the joint effects of the varying factors may be compared 
with their isolated effects, and it is possible that the joint effect 
will not be the mere sum of the respective effects. We can adapt 
the methods given by Fisher in his Design of Experiments where the 
investigations concerned agriculture (manuring of fields, rotation 
of crops, etc.) to our educational problems. Much investigation 
remains to be done on suitable teaching methods for children of 
various ages and capacities and in various subjects. We might
use (a) oral methods alone, (b) film strip, (c) cinema film,
(d) practical work and exercises, and (e) a combination of two or
more of these methods. We might expect that combinations of the
methods would be more effective than the use of a single method.

In the analysis of variance what is known as error is the com- 
bined effect of various influences which either cannot be or are 
not controlled in the investigation. Certain precautions must be 
observed in order that we can estimate this error. With small
sampling techniques it is necessary to secure the replication or
repetition of individual items with similar factorial content.
Where the ‘interactions’ are known or can be shown to be
significant they may be used to measure error. Secondly, within the
conditions imposed by the experimental design the items should
be assigned at random. Randomization may be secured by a
mechanical method such as tossing coins, drawing cards or by
using sets of random numbers. Fisher used the name ‘randomized
blocks’ for an experimental design which involved these principles.
Eight blocks of land are selected and each is divided into five 
plots. Five varieties of a particular kind of crop, or five types of 
fertilizer are assigned at random to each plot. We could translate 
this into a research in education by testing the relative merits of 
five different methods of training. Such problems as the methods 
of teaching various processes in arithmetic, improving memoriza- 
tion or treating delinquents would be susceptible to such treat- 
ment. Obviously the children to be studied will differ according 
to home and school environment and thus the children used in 
the investigation are chosen from eight schools. Children of 
about the same age are picked at random from the schools and a
different method of training is allotted at random to each
individual. In analysing the results there will be only one criterion
of classification: that according to training or treatment.
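
The random allotment within blocks can be sketched as follows. This is only an illustration under stated assumptions: the school names, the method letters A to E and the function `randomized_blocks` are invented here, and a seeded pseudo-random shuffle stands in for coins, cards or a table of random numbers.

```python
import random

def randomized_blocks(blocks, treatments, seed=None):
    """Allot every treatment once, in random order, within each block."""
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    plan = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)     # independent shuffle for each block
        plan[block] = order
    return plan

# Eight schools (blocks) and five methods of training (treatments).
schools = [f"School {i}" for i in range(1, 9)]
methods = ["A", "B", "C", "D", "E"]

plan = randomized_blocks(schools, methods, seed=1946)
for school, order in plan.items():
    # The k-th child picked from the school receives the k-th method listed.
    print(school, " ".join(order))
```

Each school receives all five methods exactly once, so differences between schools cannot be confused with differences between methods.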

But if the number of performances is large enough the number 
of ways in which they are classified or cross-classified may be 
increased from two to three or more. 

Example: We wish to investigate the efficacy of four different 
training methods (e.g. the remedial teaching of backward spellers). 
Four boys are selected and all four will be subjected to all the four 
methods. To obviate possible differences arising from the test 
words used in the experiment, all the words will have to be taught 
by all the methods. It is possible, even probable, that the order 
in which a boy is taught by the different methods may make some 
difference to the result. For instance, if he starts with a phonic 
method and goes on to a copying method, the latter might be 
helped by the former. Again, if he starts the week with a phonic 
method and goes on to the others on subsequent days this might 
affect the results. Thus, as far as can possibly be managed it is 
necessary so to arrange the order that, with one boy or another, 
each method follows and precedes every one of the others. 

The following arrangement, which meets these requirements, 
is known as the Latin Square as the Roman or Latin capitals 
A, B, C and D represent the four methods. When a further 
classification is necessary Greek letters are used and the 
arrangement is then known as a Graeco-Latin Square. 


ARRANGEMENT OF TEACHING METHODS IN A LATIN SQUARE


Order    Tom    Dick    Harry    George
1        A      B       C        D
2        B      D       A        C
3        C      A       D        B
4        D      C       B        A
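
The two requirements, that every method occurs once in each row and column and that each method follows and precedes every other with one boy or another, can be verified mechanically. The sketch below assumes the arrangement printed above; the function names are chosen here for illustration.

```python
from itertools import permutations

# Rows are the orders 1 to 4; columns are Tom, Dick, Harry, George.
square = [
    ["A", "B", "C", "D"],
    ["B", "D", "A", "C"],
    ["C", "A", "D", "B"],
    ["D", "C", "B", "A"],
]

def is_latin_square(sq):
    """Every symbol appears exactly once in each row and each column."""
    n = len(sq)
    symbols = set(sq[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in sq)
    cols_ok = all(set(col) == symbols for col in zip(*sq))
    return len(symbols) == n and rows_ok and cols_ok

def immediate_successions(sq):
    """Ordered pairs (earlier, later) of methods met consecutively by a boy."""
    pairs = set()
    for boy_sequence in zip(*sq):          # read down each boy's column
        pairs.update(zip(boy_sequence, boy_sequence[1:]))
    return pairs

print(is_latin_square(square))
print(immediate_successions(square) == set(permutations("ABCD", 2)))
```

With four methods there are twelve possible ordered pairs, and the four boys between them supply exactly twelve successions, so the square is balanced for order as well as for position.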


We will now express the marks in the tests designed to examine
the teaching methods. For convenience in analysis these have
been arranged in the form of deviations from the general mean.


RESULTS OF TEACHING


Test
Material    Tom   Dick   Harry   George   Total   Average   Square
i            26     15      -3      -10      28         7       49
ii           22      5      -1      -18       8         2        4
iii         -10      1      -9        2     -16        -4       16
iv           -2     -9      -7       -2     -20        -5       25
Total        36     12     -20      -28       0         0       94
Average       9      3      -5       -7       0
Square       81      9      25       49     164


To calculate the averages for each training method we rearrange
the figures in each column as follows:


Teaching
Method      Tom   Dick   Harry   George   Total   Average   Square
A            26      1      -1       -2      24         6       36
B            22     15      -7        2      32         8       64
C           -10     -9      -3      -18     -40       -10      100
D            -2      5      -9      -10     -16        -4       16
Total        36     12     -20      -28       0         0      216


From each figure in the last table but one we now subtract the 
sum of the appropriate averages for (i) the boy, (ii) the test 
material, and (iii) the teaching method. We obtain the following 
residuals: 

RESIDUALS AND THEIR SQUARES


Test                   Residuals                           Squares
Material    Tom  Dick  Harry  George  Total     Tom  Dick  Harry  George  Total
i             4    -3      5      -6      0      16     9     25      36     86
ii            3     4     -4      -3      0       9    16     16       9     50
iii          -5    -4      4       5      0      25    16     16      25     82
iv           -2     3     -5       4      0       4     9     25      16     54
Total         0     0      0       0      0      54    50     82      86    272

The sums of the squares are tabulated below. In entering those
for each of the means we have multiplied the squares from a 
single column or row by the number of columns or rows (in this 
case 4), since the means are repeated in each column and in 
each row. 
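
The subtraction of the three averages, and the multiplication of the squared averages by 4, may be followed in a short sketch. It assumes the deviations and the Latin Square arrangement given above; the variable names are illustrative only.

```python
# Deviations from the general mean; rows are test materials i to iv,
# columns are Tom, Dick, Harry, George.
deviations = [
    [26, 15, -3, -10],
    [22, 5, -1, -18],
    [-10, 1, -9, 2],
    [-2, -9, -7, -2],
]
# Teaching method applied to each entry (the Latin Square).
layout = ["ABCD", "BDAC", "CADB", "DCBA"]
n = 4

boy_avg = [sum(col) / n for col in zip(*deviations)]
test_avg = [sum(row) / n for row in deviations]

method_total = {}
for r in range(n):
    for c in range(n):
        letter = layout[r][c]
        method_total[letter] = method_total.get(letter, 0) + deviations[r][c]
method_avg = {letter: total / n for letter, total in method_total.items()}

# Residual = figure - boy average - test average - method average.
residuals = [
    [deviations[r][c] - boy_avg[c] - test_avg[r] - method_avg[layout[r][c]]
     for c in range(n)]
    for r in range(n)
]
residual_ss = sum(v * v for row in residuals for v in row)

# Each squared average is repeated n times in the full table.
boys_ss = n * sum(a * a for a in boy_avg)
tests_ss = n * sum(a * a for a in test_avg)
methods_ss = n * sum(a * a for a in method_avg.values())

print(residuals[0], residual_ss, boys_ss, tests_ss, methods_ss)
```

The four sums of squares add up to the sum of the squared deviations themselves, which is the check on the whole analysis.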


ANALYSIS OF VARIANCE (LATIN SQUARE)


Source of            Degrees of    Sum of     Mean
Variation            Freedom       Squares    Square
Boys                     3            656     218.6
Test Material            3            376     125.3
Teaching Methods         3            864     288.0
Residuals                6            272      45.3
Total                   15           2168


VARIANCE RATIOS, OBSERVED AND EXPECTED


                                             Degrees of       Expected
Source              Observed                 Freedom        5%       1%
Boys                218.6 / 45.3 = 4.82      3 and 6       4.76     9.78
Test Material       125.3 / 45.3 = 2.76      3 and 6       4.76     9.78
Teaching Methods    288.0 / 45.3 = 6.35      3 and 6       4.76     9.78

The differences in the effects of teaching are fully significant
but those for the boys are only just over the borderline. There is
no discernible difference between the different types of test
material.

With a more elaborate experiment we could study the inter- 
actions, that is, the differences in effect of teaching methods on 
particular types of pupil or test material. It has been assumed in
the above example that the ‘interaction’ can be taken as a measure
of error for the main effects.

Methods of Working


In actual practice it will involve considerable labour to work 
with the actual deviations, for the means will usually involve 
decimal fractions. The following procedure will make the 
arithmetical work simple and mechanical. It will be illustrated 
from the problem on page 168 involving three criteria. Here are 
the steps of the process: 


1. Find the totals of the rows and the columns, and the grand 
total. 


2. Divide the totals by the number in the corresponding row, 
column or table. 


3. Multiply each total by the corresponding mean. 

This may be done by a calculating machine, but if one is not 
available, square the means, and multiply by the number of 
items on which each mean is based. (The result is obviously the 
same, but the ‘total x mean’ method avoids any mistakes in 
multiplying the squares, when the number of rows differs from 
the number of columns.) 


4. Add the products. 


5. With the Latin Square rearrange the rows and find the 
‘totals x means’ as before. 


6. Square each figure in the first table and find the grand total 
of the squares. 


7. From each of the four totals thus obtained, subtract the 
product of the grand total by the grand mean. The results are 
the square-sums for the various means and the total square-sum. 


8. To find the square-sum for the residuals, subtract the sum 
of the three square-sums for the means from the total square-sum. 
The final result can be checked by directly calculating the squares 
for the residuals, at least approximately. 

WORKING METHOD. STEPS I, II, III AND IV


Test
Material    Tom     Dick    Harry   George  | Total |  Mean   Product
i           A 46    B 35    C 17    D 10    |  108  |   27     2916
ii          B 42    D 25    A 19    C 2     |   88  |   22     1936
iii         C 10    A 21    D 11    B 22    |   64  |   16     1024
iv          D 18    C 11    B 13    A 18    |   60  |   15      900
Total       116     92      60      52      |  320  |   80     6776
Mean        29      23      15      13      |   80  |   20
Product     3364    2116    900     676     | 7056  |          6400

STEP V

Teaching
Method      Tom     Dick    Harry   George  | Total |  Mean   Product
A           46      21      19      18      |  104  |   26     2704
B           42      35      13      22      |  112  |   28     3136
C           10      11      17      2       |   40  |   10      400
D           18      25      11      10      |   64  |   16     1024
Total                                                          7264

STEP VI

Teaching
Method      Tom     Dick    Harry   George  | Total
A           2116    441     361     324     | 3242
B           1764    1225    169     484     | 3642
C           100     121     289     4       |  514
D           324     625     121     100     | 1170
Total       4304    2412    940     912     | 8568

STEP VII

                    Crude          Correction
                    Square Sum     Term
Boys                7056      -    6400    =    656
Test Material       6776      -    6400    =    376
Teaching Methods    7264      -    6400    =    864
Total               8568      -    6400    =   2168

STEP VIII


Square Sum for Residuals    2168 - (656 + 376 + 864) = 272
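
The eight steps translate almost line for line into a short program. The following is a sketch under the assumption that the marks and the Latin Square arrangement are those of the working table; the helper `totals_times_means` and the other names are invented here for illustration.

```python
# Marks and Latin-square layout copied from the working table above.
marks = [
    [46, 35, 17, 10],   # test material i
    [42, 25, 19, 2],    # test material ii
    [10, 21, 11, 22],   # test material iii
    [18, 11, 13, 18],   # test material iv
]
layout = ["ABCD", "BDAC", "CADB", "DCBA"]  # method used for each mark

def totals_times_means(groups):
    """Steps 1 to 4: add up 'total x mean' over each group of marks."""
    return sum(sum(g) * (sum(g) / len(g)) for g in groups)

rows = marks
columns = [list(col) for col in zip(*marks)]

# Step 5: rearrange the marks by teaching method.
by_method = {}
for row, letters in zip(marks, layout):
    for mark, letter in zip(row, letters):
        by_method.setdefault(letter, []).append(mark)

all_marks = [m for row in marks for m in row]
# Correction term: grand total x grand mean (320 x 20).
correction = sum(all_marks) * (sum(all_marks) / len(all_marks))

# Steps 6 and 7: crude square sums less the correction term.
square_sums = {
    "Boys": totals_times_means(columns) - correction,
    "Test Material": totals_times_means(rows) - correction,
    "Teaching Methods": totals_times_means(by_method.values()) - correction,
}
total_ss = sum(m * m for m in all_marks) - correction

# Step 8: residual square-sum by subtraction.
residual_ss = total_ss - sum(square_sums.values())

residual_ms = residual_ss / 6          # 6 degrees of freedom
for source, ss in square_sums.items():
    ratio = (ss / 3) / residual_ms     # 3 degrees of freedom each
    print(f"{source:17s} square-sum {ss:6.0f}  ratio {ratio:.2f}")
print(f"{'Residuals':17s} square-sum {residual_ss:6.0f}")
```

Working from raw marks in this way avoids the deviations altogether, which is the saving of labour the procedure is designed to give.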


Such comparatively simple analysis may lead to more elaborate
experimental designs such as those in which there may be two or
three criteria of classification, one or two essential interactions and
several items instead of only one in each sub-class. The technique
may also be extended to the testing of simple and multiple
regressions and their linearity. This is given in Mather, Chapters
VIII and IX. It may also be applied to intra-class correlation (see
Fisher, Statistical Methods) and to the analysis of covariance. The
latter is necessary where the criteria of classification may be not
independent but correlated. Suppose it is necessary to test
alleged differences in educational attainments between children
in various parts or towns of a county at a transfer examination.
It may be that the age composition varies from one part to
another. Regression must then be used to eliminate the effects
of differing age. This is best done by analysing the covariance
as well as the variance. The method is given in Snedecor,
Chapter VIII.


1 See Sir Cyril Burt’s report on ‘Teaching Backward Readers’, British Journal
of Educational Psychology, XVI.

The works of Fisher, Snedecor, Yule and Tippett mentioned 
in the bibliography may be consulted for more advanced work on 
the analysis of variance. 


APPENDIX I 


GRAPHS AND GRAPHICAL METHODS. 
THE DIFFERENTIAL CALCULUS AND 
TRIGONOMETRICAL FUNCTIONS 


Graphical methods of expression will prove very helpful in
simple statistical investigations. In fact, for those who have
only the slightest knowledge of mathematics they will often
prove to be the only means of dealing with the results of an
investigation, lists of scores and so on. Even where the investigator
is well equipped mathematically the graphical method still remains
the best means of recording and interpreting results in many
cases.

Graphs make an immediate appeal to the eye. Even where
there is little ‘aptitude for figures’ the visual image is the one,
above all others, which can be most easily remembered, analysed
and interpreted.

Graphs give a picture of the variation of one quantity with
another, and properly interpreted the graph will provide a clue
to the extent and nature of this variation.

Unless the investigator knows something of the calculus, of
exponentials, etc., the graph is often the only means of representing
the variation. Finding the areas enclosed by graphs is an easy
way of ‘integrating’; tangents drawn to points on curved graphs
anticipate the process of ‘differentiating’. Maximum and minimum
points are easily seen and interpreted. With a graph,
interpolation is possible, that is, intermediate values between the
plotted points may be found. A curve or line may be extended by
having regard to its general shape and hence finding further
values which are outside the range of the points that are plotted.
This is known as extrapolation. The processes of interpolation
and extrapolation are not to be undertaken lightly. In the former
case intermediate values should be found by experiment and
observation, particularly where a curve turns sharply. In the
latter case the continuation of a line is a very risky procedure, for
factors may come into play which alter the general trend, and in
psychological investigations these ‘tails’ may have considerable
significance. Interpolation and extrapolation should be applied
on the merits of each case and then with care and reticence.

A point (x, y) may be fixed on a plane surface by referring it to
two axes. It is convenient to draw these as straight lines at
right angles. If the horizontal and vertical axes divide the graph
paper into four equal parts we can provide for an equal number of
x and negative x values and of y and negative y values. If we are
only concerned with positive values of x and y it will suffice to
draw the axes respectively at the bottom and at the left side of
the paper. Distances are measured from the origin, which is the
point O where the axes intersect, and it is conventional to regard
values measured to the right and upwards as positive and those
to the left and downwards as negative. To plot a point (x, y) it is
necessary to measure along the x axis a distance x and upwards
a distance y. It is necessary to consider carefully what scales can
be employed for both x and y values, in other words, how many
units of x and y are represented by a division on the graph paper.

If a straight line is drawn on the graph paper it will contain
a series of points which represent values of x and y which are
related together in a simple way. x and y are connected together
in terms of a simple equation, appropriately called a linear
equation. The value of y is dependent on that of x: y is known as the
dependent variable, and x the independent variable. y becomes a
function of x and is sometimes written $y = f(x)$.

Let us first consider a straight line drawn through the origin O
and at an angle $\theta$ (theta) with the axis of x (OX).

Consider any point P on the line.

Its co-ordinates, that is its x and y values, are related together by

$$\frac{y}{x} = \tan\theta \quad\text{or}\quad y = x\tan\theta$$

The slope of the line can thus be thought of as the tangent of the
angle which the line makes with the axis of x.¹ The equation of
this line has already been given: it is $y = x\tan\theta$ and this connects
all the x and y values on the line.

When the line does not go through the origin but meets the
axis of y at a point cutting off a piece OC (c) on it, it will readily
be seen that the equation of the straight line is $y = x\tan\theta + c$,
for every y value corresponding to an x in the previous equation
of the line through the origin will have to be increased by the
intercept c on the axis of y.

1 See page 189.

Any equation which can be put in the form $lx + my + n = 0$,
where l, m and n are independent of x and y, can be represented on
a graph as a straight line.

In this case, the slope of the line $= -\dfrac{l}{m}$

and the intercept on the axis of y $= -\dfrac{n}{m}$

A linear relationship is said to exist between two sets of measures 
if a straight-line graph is yielded when points representing 
corresponding sets of values are plotted and joined. 

The use of straight-line or other graphs as ready reckoners, 
conversion tables, etc., needs no stressing. 

A few words should be said about regression lines. The line
y = rx gives the regression of y on x and x = ry gives the regression
of x on y. Where r = 1 (perfect correlation) the line y = x goes
through the origin and makes an angle of 45° with both axes.
The older school of statisticians would say that when correlation
was perfect there was no regression, but some writers make r, the
correlation coefficient (and slope of the regression line), a direct
measure of the regression. From the context it is usually easy to
see what a writer intends to convey. Regression gives us a measure
of the reliability of predicting the value of a measure by reference
to that of another with which it is correlated to a greater or
lesser degree.

The calculus is best approached by considering the graphs of 
curves. We may look upon differentiation as a process of measur- 
ing rate of change, curvature, etc., and integration as one of 
summation, the determination of areas, etc. Differentiation and
integration may be regarded as each the reverse of the other. As
these processes involve conceptions relating to infinity and
infinitesimals care must be taken to see that these ideas are not
given the form of absolute numbers.

Suppose the curved line represents a function f(x) of x. Its
equation is $y = f(x)$. Consider a point $P_1$ on the line, whose
co-ordinates are x and y. Further, take another point $P_2$ near to
it with slightly larger co-ordinates $x + \delta x$ and $y + \delta y$, where
$\delta x$ and $\delta y$ (delta x and delta y) are small increments in the
value of x and y respectively.

Now consider the small triangle $P_1MP_2$ with vertical side $\delta y$
and base $\delta x$. Its hypotenuse $P_1P_2$ will approximate to a portion
of the curve as $\delta y$ and $\delta x$ become smaller.

$P_1P_2$ will be a tiny part of a tangent to the curve as $P_1$ and $P_2$
approach one another.

The slope of this tangent $= \dfrac{\delta y}{\delta x}$

$$
\begin{aligned}
y &= f(x) \\
y + \delta y &= f(x + \delta x) \\
\therefore\ \delta y &= f(x + \delta x) - f(x) \\
\frac{\delta y}{\delta x} &= \frac{f(x + \delta x) - f(x)}{\delta x}
\end{aligned}
$$


It is necessary to utter a word of warning that the rigorous
treatment of the calculus must be regarded as being beyond the scope
of this short statement. $\dfrac{\delta y}{\delta x}$ is a true quotient obtained by dividing
small but finite quantities $\delta y$ and $\delta x$, but when we proceed to the
limit and obtain the differential coefficient $\dfrac{dy}{dx}$ this must not be
regarded as a fraction but as an operator $\dfrac{d}{dx}$ acting on y. The
differential coefficient of a function is spoken of as its first
derivative and is represented by $f'(x)$.

A simple example will show this method in use.

Suppose $y = x^2$

$$
\begin{aligned}
y + \delta y &= (x + \delta x)^2 \\
&= x^2 + 2x\,\delta x + (\delta x)^2 \\
\delta y &= x^2 + 2x\,\delta x + (\delta x)^2 - x^2 \\
&= 2x\,\delta x + (\delta x)^2 \\
\therefore\ \frac{\delta y}{\delta x} &= 2x + \delta x
\end{aligned}
$$

Now making $\delta y$ and $\delta x$ smaller and smaller

$$\frac{dy}{dx} = 2x$$

(This is where this method, though simple, lacks rigour, for we
assume that $\delta x$ vanishes but that $\dfrac{\delta y}{\delta x}$ becomes $\dfrac{dy}{dx}$. The above
method might be regarded as a useful demonstration rather than
a proof.)
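
The approach to the limit can be watched numerically. The sketch below is an illustration only: it takes x = 3 as an arbitrary value and evaluates the quotient for smaller and smaller increments; the function names are chosen here and are not part of the original text.

```python
def difference_quotient(f, x, dx):
    """The true quotient delta-y / delta-x for a finite increment dx."""
    return (f(x + dx) - f(x)) / dx

def square(x):
    return x * x

x = 3.0
for dx in (1.0, 0.1, 0.01, 0.001):
    # For y = x^2 the quotient is 2x + dx, so it closes in on 2x = 6.
    print(f"dx = {dx:<6} quotient = {difference_quotient(square, x, dx):.6f}")
```

The quotient never equals 2x for any finite increment; it merely differs from it by the increment itself, which is the sense in which the limit is 2x.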


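To find the differential coefficient or the derivative for $x^n$ we
need to keep in mind the binomial expansion for $(x + a)^n$

$$
(x + a)^n = x^n + n x^{n-1} a + \frac{n(n-1)}{1 \times 2}\, x^{n-2} a^2
+ \frac{n(n-1)(n-2)}{1 \times 2 \times 3}\, x^{n-3} a^3 + \dots + a^n
$$

Suppose $y = x^n$

$$
\begin{aligned}
y + \delta y &= (x + \delta x)^n \\
&= x^n + n x^{n-1}\,\delta x + \frac{n(n-1)}{1 \times 2}\, x^{n-2} (\delta x)^2 + \dots \\
\delta y &= n x^{n-1}\,\delta x + \frac{n(n-1)}{1 \times 2}\, x^{n-2} (\delta x)^2 + \dots \\
\frac{\delta y}{\delta x} &= n x^{n-1} + \text{terms containing } \delta x \text{ and its higher powers.}
\end{aligned}
$$

Proceeding to the limit
$\dfrac{dy}{dx} = n x^{n-1}$, as terms containing $\delta x$ and its powers vanish in the limit.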

As the differential coefficient gives a measure of the slope of the
curve it will be equal to 0 where the curve has no slope, that is to
say at the points of the curve where the tangents are horizontal.

Thus, we find values of x which correspond to maximum or 
minimum values of the function by equating the differential 
coefficient to zero and solving the equation. 

This method will not distinguish between maximum and minimum
values but it can readily be seen that, as we trace out a curve,
a tangent to the moving point will turn in a clockwise direction
as we approach and pass a maximum value and it will turn
in an anticlockwise direction as we approach and pass a minimum.

Thus, a further process of differentiation (double differentiation)
will give us a clue to the recognition of maxima and minima.
If the second differential coefficient $\dfrac{d^2y}{dx^2}$ has a positive value the
point concerned will be a minimum and if it has a negative value
the point will be a maximum.


Differentiation                                   Integration

$\frac{d}{dx}\,x^n = n x^{n-1}$                   $\int x^n\,dx = \frac{x^{n+1}}{n+1}$

$\frac{d}{dx}\,\log_e x = \frac{1}{x}$            $\int \frac{1}{x}\,dx = \log_e x$

$\frac{d}{dx}\,\cos x = -\sin x$                  $\int \sin x\,dx = -\cos x$

$\frac{d}{dx}\,\sin x = \cos x$                   $\int \cos x\,dx = \sin x$

$\frac{d}{dx}\,\tan x = \sec^2 x$                 $\int \sec^2 x\,dx = \tan x$

$\frac{d}{dx}\,\cot x = -\operatorname{cosec}^2 x$      $\int \operatorname{cosec}^2 x\,dx = -\cot x$

$\frac{d}{dx}\,e^x = e^x$                         $\int e^x\,dx = e^x$

Differentiation: $\frac{d}{dx}$ should be regarded as an operator and not as a
fraction.

Integration: a constant, which can be determined from the practical
nature of the problem and the given data, has to be added in each
case. This is obvious when it is remembered that integration is the
reverse of differentiation and that the differential coefficient of a
constant is zero.
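
The rule for distinguishing maxima from minima may be seen at work in a single worked case. The cubic below is chosen here purely for illustration and does not come from the original text:

```latex
\begin{aligned}
y &= x^3 - 3x \\
\frac{dy}{dx} &= 3x^2 - 3 = 0 \quad\Rightarrow\quad x = +1 \ \text{or}\ x = -1 \\
\frac{d^2y}{dx^2} &= 6x \\
\text{at } x = +1:\quad \frac{d^2y}{dx^2} &= +6 > 0, \quad\text{a minimum, with } y = -2 \\
\text{at } x = -1:\quad \frac{d^2y}{dx^2} &= -6 < 0, \quad\text{a maximum, with } y = +2
\end{aligned}
```

Equating the first differential coefficient to zero finds both turning points; only the sign of the second tells them apart.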


Trigonometrical Functions of an Angle


Consider the right-angled triangle POX with angle POX = $\theta$.
PX (perpendicular) = p, OX (base) = b, OP (hypotenuse) = h.


sine $\theta$ (sin) $= \dfrac{p}{h}$            cotangent $\theta$ (cot) $= \dfrac{b}{p}$

cosine $\theta$ (cos) $= \dfrac{b}{h}$          secant $\theta$ (sec) $= \dfrac{h}{b}$

tangent $\theta$ (tan) $= \dfrac{p}{b}$         cosecant $\theta$ (cosec) $= \dfrac{h}{p}$

It will readily be seen, using the properties of a right-angled
triangle, that each of these functions may be calculated by knowing
any one of the others. The following relationships are most important:

$$\tan\theta = \frac{\sin\theta}{\cos\theta}, \qquad \sin^2\theta + \cos^2\theta = 1.$$

$$\cot\theta = \frac{1}{\tan\theta}, \qquad \sec\theta = \frac{1}{\cos\theta}, \qquad \operatorname{cosec}\theta = \frac{1}{\sin\theta}.$$

$$\cos(90^\circ - \theta) = \sin\theta, \qquad \sin(90^\circ - \theta) = \cos\theta.$$

The angle $\theta$ must not be regarded as an angle limited to less than
a right angle. A triangle of reference POX may be drawn by
dropping a perpendicular PX from a point P on the line OP
generating the angle on to the axis of x (the line −XOX). Although the
tables only give angles between 0° and 90° the trigonometrical
functions for other angles may be calculated by arranging them
as $(180^\circ - \theta)$, $(180^\circ + \theta)$, $(360^\circ - \theta)$ where $\theta$ is an angle less than
90° which can be found from the tables. The following diagram
shows when it is necessary to change the sign of the function
found in the tables. Angles are measured in an anticlockwise
direction and the complete round of angles (360°) is divided into
four quadrants:

(180° − θ)  sine +, cosec +      |   All +
---------------------------------+---------------------------------
(180° + θ)  tan +, cot +         |   (360° − θ)  cosine +, sec +

or in the mnemonic form by using the word CAST:

      S | A
      --+--
      T | C

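It may be useful to remember that:


$\sin 0^\circ = 0$                      $\cos 0^\circ = 1$                      $\tan 0^\circ = 0$

$\sin 30^\circ = \frac{1}{2}$           $\cos 30^\circ = \frac{\sqrt{3}}{2}$    $\tan 30^\circ = \frac{1}{\sqrt{3}}$

$\sin 45^\circ = \frac{1}{\sqrt{2}}$    $\cos 45^\circ = \frac{1}{\sqrt{2}}$    $\tan 45^\circ = 1$

$\sin 60^\circ = \frac{\sqrt{3}}{2}$    $\cos 60^\circ = \frac{1}{2}$           $\tan 60^\circ = \sqrt{3}$

$\sin 90^\circ = 1$                     $\cos 90^\circ = 0$                     $\tan 90^\circ = \infty$ (infinity)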
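
The use of tables for angles beyond a right angle can be imitated as follows. This is a sketch under stated assumptions: the library functions applied to the reduced angle stand in for the printed tables, and the names `reduce_to_first_quadrant` and `from_tables` are invented here.

```python
import math

def reduce_to_first_quadrant(angle_deg):
    """Return (tabulated angle between 0 and 90 degrees, quadrant 1 to 4)."""
    a = angle_deg % 360
    if a <= 90:
        return a, 1            # theta itself
    if a <= 180:
        return 180 - a, 2      # angle written as (180 - theta)
    if a <= 270:
        return a - 180, 3      # angle written as (180 + theta)
    return 360 - a, 4          # angle written as (360 - theta)

# Signs by quadrant: All, Sine, Tangent, Cosine positive (CAST).
SIGNS = {
    "sin": {1: 1, 2: 1, 3: -1, 4: -1},
    "cos": {1: 1, 2: -1, 3: -1, 4: 1},
    "tan": {1: 1, 2: -1, 3: 1, 4: -1},
}
TABLES = {"sin": math.sin, "cos": math.cos, "tan": math.tan}

def from_tables(name, angle_deg):
    """Look up the first-quadrant value and attach the sign for the quadrant."""
    theta, quadrant = reduce_to_first_quadrant(angle_deg)
    return SIGNS[name][quadrant] * TABLES[name](math.radians(theta))

for angle in (120, 210, 300):
    print(angle, [round(from_tables(f, angle), 4) for f in ("sin", "cos", "tan")])
```

For example 210° is arranged as (180° + 30°), which lies in the quadrant where only the tangent is positive, so its sine is minus the tabulated sine of 30°.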

Sometimes mental tests, mental ‘factors’, etc., are represented
as vectors, that is, straight lines at an angle to one another. The
correlation coefficient between the quantities represented by any
two lines is given by the cosine of the angle between them. The
projection of one line upon another is equal to the length of the
first line multiplied by the cosine of the angle between the lines.
(Do not confuse this with regression and remember that the ‘slope’
of a line is given by the tangent of the angle which it makes with an
axis of reference.)

Factors, etc., represented by vectors at right angles are obviously
uncorrelated (cos 90° = 0) and they are said to be orthogonal.

Factors, etc., represented by vectors which are not at right
angles contain some measure of correspondence (the cosine of the
angle between them is not zero). These are said to be oblique factors.

This useful idea can be extended from two dimensions to three
(and analytically, without trying to conceive models, to 4 or more.
The geometry of hyperspace can be used for dealing with more
than 3 factors which are represented by vectors). Three orthogonal
factors can be thought of as lying along the edges of a
rectangular box and meeting at one of its corners. A number of
oblique factors could be drawn as lines in space radiating from a
point. If an arbitrary line were taken to represent the first factor
the other lines could be imagined to fit into their relative positions
by taking the correlation coefficient between each pair, finding
the angle of which it is the cosine and fitting in the line
accordingly. With three lines this involves a simple principle of solid
geometry but with four or more, analytical methods using algebra
and trigonometry may have to suffice. Angles are not always
given in degrees, and it is often more convenient to think of them
in radian measure.

$2\pi$ radians = 360°

$\pi$ radians = 180°

1 radian $= \dfrac{180^\circ}{\pi}$ = 57.3° (approximately)

When the symbol $\pi$ appears in formulae used in psychological
and educational statistics it usually refers to an angle of two right
angles or 180°.
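
The passage from a correlation coefficient to the angle between two vectors, and from degrees to radians, can be sketched briefly. The coefficients used are arbitrary illustrative values, not results from any test, and the function names are assumptions made here.

```python
import math

def angle_between(r):
    """Angle in degrees whose cosine is the correlation coefficient r."""
    return math.degrees(math.acos(r))

def projection(length, r):
    """Projection of a line of the given length on a line correlated r with it."""
    return length * r

for r in (1.0, 0.5, 0.0, -0.5):
    degrees = angle_between(r)
    print(f"r = {r:5.2f}  angle = {degrees:6.1f} degrees "
          f"= {math.radians(degrees):.4f} radians")

print(round(math.degrees(1.0), 1))   # one radian expressed in degrees
```

A coefficient of .5 thus corresponds to an angle of 60° and a coefficient of zero to a right angle, the orthogonal case.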

THE USE OF THE SLIDE-RULE¹


The slide-rule, which dates from about the same period as
that of the invention of logarithms, is really a simple instrument
working on logarithmic principles. To multiply two
numbers we add their logarithms. If, therefore, we have two
scales whose distances and divisions are measured out in the
lengths of the logarithms which they represent it is easy to see that
numbers may be multiplied by adding these logarithmic lengths
by means of two scales one of which is capable of sliding against
another. Division may be performed by subtracting these logarithmic
lengths, squaring by doubling and finding a square root by
halving and so on. In our work the slide-rule is particularly useful
when each of a set of numbers has to be multiplied (or divided)
by a factor, as for instance in reducing a set of marks from one
maximum to another. One setting of the rule is all that is required
and the reduced marks may be read off directly from the rule.

Although most work in educational and psychological statistics 
does not call for the full resources of the instrument such as is 
used by engineers, it is worth while to acquire a good one, which 
will cost from 30s. to £3. The beginner need not feel overwhelmed
by the amount of metrical material compressed into one scale. If 
any difficulty arises it will suffice to make a simple slide-rule by 
gumming two strips of logarithmic graph paper to two ruler-like 
pieces of wood respectively which can be made to slide against 
one another and may be kept together by a couple of small elastic 
bands. No difficulty is expected, however. 


Finding Numbers 

The front face of the ordinary 10-inch slide-rule consists of two 
pairs of scales; the upper ones usually are called the A and B scales 
and the lower pair are known as the C and D scales. 


1 See also the section on Logarithms in The Teaching of Arithmetic and
Elementary Mathematics, by the author.

Any number of whatever reasonable magnitude can be located
on the slide-rule, because the first mark can be called 1, 10 or 100 
as required. The sub-division of units sometimes gives difficulty 
at first but since there are only three different variations to learn 
these should be mastered at the outset. 

If we call the first mark on the A scale 10, the number 11 is to
be found five graduations (division marks) further along: the
space between 10 and 11 is divided into five parts, with graduation
marks at 10.2, 10.4, 10.6 and 10.8, leaving any smaller divisions to be
estimated as required. This method of marking continues until
20 is reached, after which the spaces between the whole numbers
are not large enough to allow five divisions, so from thence
onwards the units are only cut in half. From 50 to 100 there is not
even room for this to be done and the units are no longer
subdivided.

On the D scale there is more room as ‘smaller’ numbers are
involved. If the beginning is called 10, the number 11 is found
ten marks further along, the intermediate values being 10.1, 10.2,
etc., to 10.9 and this system is continued up to 20. From 20 to 40
the units have five divisions each, e.g., 20.2, 20.4, 20.6, 20.8, after
which there is only sufficient room for half divisions to be shown.

If a 10-inch slide-rule is examined carefully so that these facts 
are appreciated facility in finding and reading numbers will soon 
follow. 

It is always worth while to perform rough mental calculations 
of the answer as this will help to find the correct place for the 
decimal point. 


1. Multiplication 


Example: 14.6 × 3.2 (approximate value 50). Put B1 (the
beginning of the B scale) against one of the numbers on the A
scale. Locate the second number on the B scale and read off the
product from the A scale immediately above the B scale number.
The fine vertical line of the transparent window of the sliding
cursor may help in reading a number on one scale which is
exactly in line with a number on the other.

In effect, in this process of multiplication a piece of the A scale
has been added to a piece of the B scale and, as the numbers are 
multiplied together by adding their logarithmic lengths, the total 
length indicates the products of the two numbers. 


2. Division 

Example: 43.6 ÷ 19.8 (estimated approximate value 2).

Place the divisor 19.8 on the B scale immediately under the
dividend 43.6 on the A scale. The quotient may be read off on the
A scale immediately above B1. In division a piece of Scale B is
subtracted from a piece of Scale A. To divide two numbers we 
subtract their logarithms. 

Both multiplication and division can be performed on the C and 
D scales. The results can usually be estimated to a greater degree 
of accuracy owing to the larger divisions, but working is generally 
a little slower than with the A and B scales. 
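
The principle of adding and subtracting logarithmic lengths can be reproduced in a few lines. This is a sketch of the principle only, using the two worked examples above; a real rule reads to about three figures, whereas the arithmetic here is carried to full machine precision. The function names are illustrative.

```python
import math

def slide_rule_multiply(a, b):
    """Add the logarithmic lengths of a and b, then read back the number."""
    return 10 ** (math.log10(a) + math.log10(b))

def slide_rule_divide(a, b):
    """Subtract the logarithmic length of b from that of a."""
    return 10 ** (math.log10(a) - math.log10(b))

product = slide_rule_multiply(14.6, 3.2)    # rough estimate: 50
quotient = slide_rule_divide(43.6, 19.8)    # rough estimate: 2

print(round(product, 1), round(quotient, 2))
```

The rough mental estimate still matters on the instrument itself, since the scales give the digits of the answer but not the position of the decimal point.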


3. Conversion and Reduction 

These processes are equivalent to multiplying or dividing the 
given number by a certain factor. It will be seen that division by 
a number is equivalent to multiplication by the reciprocal of the 
number, e.g. division by 12 is equivalent to multiplication by
1/12 or .0833. Each case must be considered on its merits, that is,
whether it is easier to multiply by a factor or divide by its
reciprocal. Example: To convert marks given with a maximum score of
80 to a maximum of 100. This is equivalent to multiplying each
mark by 100/80 or 1.25. For ease of working it is better to put B1
opposite to 1.25 on the A scale and read off the result on the A
scale immediately above the given number on the B scale. After 
the initial setting no further movement of the scale will be required 
for the whole set of marks. 

The conversion of marks from a maximum of 100 to one of 80
need not be regarded as a division but rather as a multiplication
by the factor .8.
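
The single setting of the rule corresponds to one constant factor applied to the whole set of marks. A minimal sketch, with a set of marks invented here purely for illustration:

```python
def convert_marks(marks, old_maximum, new_maximum):
    """Rescale every mark by the one factor new_maximum / old_maximum."""
    return [mark * new_maximum / old_maximum for mark in marks]

marks_out_of_80 = [40, 52, 60, 68, 80]          # illustrative marks only
marks_out_of_100 = convert_marks(marks_out_of_80, 80, 100)

print(marks_out_of_100)                          # each mark multiplied by 1.25
print(convert_marks(marks_out_of_100, 100, 80))  # and back again, factor .8
```

Converting in one direction and then the other restores the original marks, since the two factors 1.25 and .8 are reciprocals.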

Squaring Numbers

Find the number on the D scale. Its square lies immediately 
above it on the A scale. Use the cursor. The scales all remain at 
‘zero’ position. 


Finding Square Roots 

Find the number on the A scale. Its square root lies immediately 
below it on the D scale. Now any number which is given in figures 
without a decimal point will appear to have a choice of one of 
two square roots (quite apart from negative roots), e.g. the square 
root of 4.0 is 2.0 but that of 40 is 6.3. Thus there are two positions 
for any number on the A scale, and the correct one must be 
chosen with reference to the size of the given number according 
to the following rule. For numbers with an odd characteristic use 
the right-hand part of the A scale. For numbers with an even 
characteristic use the left-hand part. The characteristic is one 
less than the number of digits to the left of the decimal point, and 
if negative is one more than the number of noughts immediately to 
the right of the decimal point, e.g. 


3167         characteristic  3    odd 
316.7        characteristic  2    even 
9.6          characteristic  0    even 
.3076        characteristic −1    odd 
.0003001     characteristic −4    even 


In using tables of square roots the same principle applies, but it is 
usually sufficient to make a rough mental estimate of the required 
value and this will determine which of the two given numbers is 
required. 
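
The rule for choosing the half of the A scale can be sketched in a few lines of Python. This is only an illustration: the function names `characteristic` and `a_scale_half` are invented here, and the characteristic is taken as the whole-number part of the common logarithm, which agrees with the counting rule given above.

```python
import math


def characteristic(x: float) -> int:
    """Characteristic of the common logarithm of a positive number."""
    if x <= 0:
        raise ValueError("x must be positive")
    return math.floor(math.log10(x))


def a_scale_half(x: float) -> str:
    """Odd characteristic: right-hand half of the A scale; even: left-hand half."""
    return "right" if characteristic(x) % 2 else "left"


if __name__ == "__main__":
    for value in (3167, 316.7, 9.6, 0.3076, 0.0003001, 4.0, 40):
        print(value, characteristic(value), a_scale_half(value),
              round(math.sqrt(value), 4))
```

The printed square roots serve as the rough mental estimate mentioned above, against which the slide-rule reading can be checked.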


A Note on Calculating Machines 

Where the statistical analysis of the data of much educational 
research has to be undertaken the routine labour necessary to 
make the large number of calculations can be reduced by using a 
calculating machine. As we have already shown the principal 
formulae which are used in educational research can be cast into 
forms which are particularly convenient when calculating 
machines are used. A typical machine suitable for our purpose 
would be the Fridén Model D-10 which contains 10 columns of 
keys. Such an instrument will not only perform the processes of 
complex addition, subtraction, multiplication and division but 
it will also extract square roots. No useful purpose will be served 
by giving instructions here concerning the use of particular 
machines and the student is advised to obtain instruction from the 
retailers or commercial users of such machines. The student 
may need practice in thinking in terms of decimals and decimal 
fractions and in making rough estimates. The serious worker in 
this field will be equipped with graph paper; ruled paper in large 
sheets with 4” squares, tables of logarithms, squares, square roots 
and statistical tables. 

1 A useful booklet of instructions concerning the working of the machine 
mentioned here, together with a series of graded exercises in the use of the 
instrument, is published by Bulmer's Calculators Ltd., 54 Kent Road, Harrogate. 


PASCAL'S TRIANGLE AND THE NORMAL 
CURVE OF DISTRIBUTION 


Suppose that we toss a penny a large number of times. In the 
long run heads and tails will be about equally divided and 
the distribution will be in the proportion 

H    T 
1    1 


If we toss two pennies there will be three possibilities: two heads, 
two tails, one head one tail, in the proportion:¹ 

HH    HT    TT 
1     2     1 

With three pennies there will be four possibilities: three heads, 
three tails, one head two tails, one tail two heads, in the proportion 

HHH    HHT    HTT    TTT 
1      3      3      1 

and so on. Although we do not find these proportions strictly 
observed unless we take inconveniently or impossibly large num- 
bers of cases these figures represent the probabilities of the 
distributions of each particular showing of heads and tails. 

This at once suggests to us that it may be useful to consider the 
numbers arising when we continue to multiply 11 by itself, that is, 
the powers of 11 


$(11)^1$ =     11 
$(11)^2$ =    121 
$(11)^3$ =   1331 
$(11)^4$ =  14641 

1 Students of biology will note that these are the proportions of offspring showing 
distinct transmissible characteristics in the simplest application of Mendel's laws, 
e.g. in the second generation of peas in the crossing of long and short peas, pure 
long peas, impure long peas and pure short peas were in the proportion 1 : 2 : 1 
respectively. 

In building up Pascal's triangle we must continue the powers of 
11 without carrying additions above 10 into a higher column. 
Those who are familiar with the binomial theorem will see that the 
above continuous multiplication by 11 gives the coefficients in the 
binomial expansion $(1 + x)^n$ of the ascending powers of x. 

Thus $(1 + x)^4 = 1 + 4x + 6x^2 + 4x^3 + x^4$ by the expansion of 
$(11)^4$ or 1 4 6 4 1. 
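
The device of multiplying by 11 without carrying can be tried out with a short Python sketch. The function names are invented for this illustration; `math.comb` from the standard library supplies the binomial coefficients for comparison.

```python
from math import comb


def times_eleven_no_carry(row):
    """Multiply by 11 column by column: add the row to itself shifted one
    place, keeping each column total whole instead of carrying."""
    return [a + b for a, b in zip([0] + row, row + [0])]


def pascal_row(n):
    """Row n of Pascal's triangle, i.e. the 'digits' of 11 to the power n."""
    row = [1]
    for _ in range(n):
        row = times_eleven_no_carry(row)
    return row


if __name__ == "__main__":
    for n in range(7):
        print(n, pascal_row(n))
    # The rows agree with the binomial coefficients of (1 + x)^n.
    print(pascal_row(6) == [comb(6, k) for k in range(7)])
```

Up to the fourth power no column exceeds 9, so the rows coincide with the ordinary digits of 11, 121, 1331 and 14641; from the fifth power onwards the columns 10, 10 appear and ordinary multiplication would carry.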

If it can be imagined that we continue Pascal's triangle to the 
limit making the number of the power n sufficiently large we should 
arrive at the exponential curve known as the probability curve or 
the curve of error. Instead of thinking of the smooth curve 
which is reached in the limit, let us consider the histogram 
given below. 


                  1 
                1   1 
              1   2   1 
            1   3   3   1 
          1   4   6   4   1 
        1   5  10  10   5   1 
      1   6  15  20  15   6   1 


Pascal’s triangle 


[Histogram: columns of height 1, 4, 6, 4, 1 for HHHH, HHHT, HHTT, HTTT, TTTT] 


It will readily be seen that the area of the whole figure 
represents the total number of cases, i.e. 1 + 4 + 6 + 4 + 1 = 16, 
the height of any column the frequency for each distribution of 
heads and tails, and the distance from the centre point of the 
horizontal line (the x distance) the degree of departure from the central 
or most common tendency (the mode), in this case, two heads and 
two tails. The chance that a single throw of four coins will give a 
particular number of heads and tails is given by the area of the 
column concerned compared with that of the whole figure, e.g. 
the chance of throwing four heads (or four tails) is 1 in 16. 
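
As a small check on these chances, the following Python sketch compares each column of the histogram with the whole figure. The function name `head_probabilities` is invented for the illustration, and exact fractions are used so that no rounding enters.

```python
from fractions import Fraction
from math import comb


def head_probabilities(n_coins):
    """Chance of each possible number of heads in one throw of n_coins fair coins."""
    total_cases = 2 ** n_coins  # area of the whole histogram
    return {heads: Fraction(comb(n_coins, heads), total_cases)
            for heads in range(n_coins + 1)}


if __name__ == "__main__":
    for heads, chance in head_probabilities(4).items():
        print(heads, "heads:", chance)
```

For four coins the most common showing, two heads and two tails, has a chance of 6 in 16, and the chances of all five showings add up to 1.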

If we now return to consider the histogram ‘smoothed out’ and 
its area representing a large number of cases, it is easy to 
appreciate that the probability that a measure (x) will lie at a certain 
distance from the central point is given by the ratios of the area 
of the tail of the curve beyond that point and the area of the 
remainder of the curve cut off by an ordinate through the point. 
In some cases (and these should be obvious) it will only be 
necessary to consider one half of the curve, that is, one or other of 
the halves on either side of the central line. 


Some Properties of the Normal Curve of Distribution 

This curve is also spoken of as the curve of error, the Gaussian 
curve or the curve of probability for reasons which we have 
already mentioned. 

The curve is a member of the family of exponential curves, that 
is, it is related to the growth function $e^x$. The exponential function 
$e^x$ has a rate of growth equal to itself, i.e. $\frac{d(e^x)}{dx} = e^x$. 

The curve may be defined as a frequency curve whose height 
at any point is inversely proportional to the antilogarithm of half 
the square of the distance, measured in terms of the standard 
deviation as the unit, of that point from the mean. 


The formula for the curve is $y = y_0\, e^{-x^2/2\sigma^2}$ where x and y are 
points on the curve with respect to O the central point on the x 
axis and $y_0$ is the ‘height’ of the curve at its central point, that is, 
the distance which it cuts off along the y axis. 


σ is the standard deviation. 


If this is large the curve is flat at the top and if this is small the 
curve is sharp and pointed. 

The degree of curvature is spoken of as kurtosis. 

For our purpose we must regard y as a frequency of a score x 
which is referred to the average as zero. 

We will differentiate the function representing the curve of normal 
distribution, written as: 

$$y = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}$$

where N is the number of cases in the distribution and σ is the 
standard deviation. 

Let us write $\frac{N}{\sigma\sqrt{2\pi}} = c$, a constant. Then 

$$y = c\, e^{-x^2/2\sigma^2}$$

$$\frac{dy}{dx} = c\, e^{-x^2/2\sigma^2} \cdot \frac{d(-x^2/2\sigma^2)}{dx} = c\, e^{-x^2/2\sigma^2}\left(-\frac{2x}{2\sigma^2}\right) = -\frac{cx}{\sigma^2}\, e^{-x^2/2\sigma^2}$$

If we substitute x = 0 in this derived function (first differential 
coefficient) it vanishes. 

Thus, this represents a value where the curve is at a maximum or 
minimum value. It is easy to see that this is actually a maximum. 

Let us try to find other points where the curve has a maximum 
or minimum value, i.e. where it is horizontal. 

Equate the first derived function to zero. 

$$-\frac{cx}{\sigma^2}\, e^{-x^2/2\sigma^2} = 0$$

Divide by $-\frac{cx}{\sigma^2}$: 

$$e^{-x^2/2\sigma^2} = 0, \quad \text{i.e.} \quad \frac{1}{e^{x^2/2\sigma^2}} = 0, \quad \text{so that} \quad e^{x^2/2\sigma^2} = \infty$$

Taking logs, $\log_e\left(e^{x^2/2\sigma^2}\right) = \infty$ 

$$\frac{x^2}{2\sigma^2}\,(\log_e e) = \infty$$

Now $\log_e e = 1$, so that $\frac{x^2}{2\sigma^2} = \infty$ 

$$\therefore\; x^2 = \infty, \qquad x = \pm\infty$$

Thus, the curve is horizontal at infinite distances from the 
central line. (It is necessary to give a word of warning about the 
above demonstration. We have used ‘infinity’ as though it were a 
number and this may lead to absurdities. The above is not a 
rigorous demonstration and it is wise to warn the student against 
using ‘plus and minus infinity’. Here we have unfortunately 
had to sacrifice rigour for the sake of a simple demonstration.) 

Students who have proceeded a little further with the calculus 
than we have done here will be able to continue and find the 
second derivative or differential coefficient of the function of the 
curve of normal distribution. 

It will be observed that at symmetrical points of the curve there 
are points of inflexion, that is, the convex curvature of the top 
part of the curve gives way to the concave lower portions on each 
side. The rate of curvature will obviously be zero at these points. 
We can find them by equating the second derivative of the function 
to zero: 

$$\frac{d^2y}{dx^2} = \frac{c}{\sigma^4}\, e^{-x^2/2\sigma^2}\,(x^2 - \sigma^2) = 0$$

Dividing through by $\frac{c}{\sigma^4}\, e^{-x^2/2\sigma^2}$: $(x^2 - \sigma^2) = 0$ 

$$\therefore\; x^2 = \sigma^2, \qquad x = \pm\sigma$$

Thus the points of inflexion are at a distance σ from the central point. 
Let us consider the curve drawn on such a scale that its area is 
unity. The total number of cases N given by the area of the curve 
will be represented by unit area. 

At the centre point or origin where x = 0 the equation of the 
curve becomes 

$$y_0 = \frac{1}{\sigma\sqrt{2\pi}}\, e^{0} = \frac{1}{\sigma\sqrt{2\pi}}$$

Thus $\frac{1}{\sigma\sqrt{2\pi}}$ is the height of the curve at its maximum (its 
modal ordinate) or the intercept cut off by the curve on the axis 
of y. 

The area of the curve $y = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}$ can be found by 
integration. The curve must be thought of as extending from an infinite 
distance to the left of the centre point to an infinite distance to 
the right. 

The total area is given by 

$$\int_{-\infty}^{+\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\, dx$$

which is equal to 1. 
[If this exponential curve could be considered as a development 
from the expansion of the binomial $(\tfrac{1}{2} + \tfrac{1}{2})^n$ the sum of all the 
ordinates is 1 for $(\tfrac{1}{2} + \tfrac{1}{2})^n = 1^n = 1$.] 

The expression representing the normal curve may be written 

$$y = \frac{1}{\sigma}\, z \quad \text{where} \quad z = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}$$

From statistical tables we may find values¹ of z for various 
values of $\frac{x}{\sigma}$. 

If the curve has unit area and unit standard deviation y = z 
and 

$$y = z = \frac{1}{\sqrt{2\pi}}\, e^{-x^2/2}$$

If N is the area of the curve the equation of the curve of normal 
distribution is 

$$y = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}$$
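
The properties of the curve stated above can be checked numerically. The Python sketch below is an illustration only: the function names, the step sizes and the choice σ = 2 are assumptions made here, and finite differences and the trapezium rule stand in for the calculus.

```python
import math


def normal_height(x, sigma=1.0, n_cases=1.0):
    """Ordinate y = N / (sigma * sqrt(2 pi)) * exp(-x^2 / (2 sigma^2))."""
    return (n_cases / (sigma * math.sqrt(2 * math.pi))
            * math.exp(-x * x / (2 * sigma * sigma)))


def slope(x, sigma=1.0, h=1e-5):
    """First derivative estimated by a central difference."""
    return (normal_height(x + h, sigma) - normal_height(x - h, sigma)) / (2 * h)


def second_derivative(x, sigma=1.0, h=1e-4):
    """Second derivative estimated by a central difference."""
    return (normal_height(x + h, sigma) - 2 * normal_height(x, sigma)
            + normal_height(x - h, sigma)) / h ** 2


def area(sigma=1.0, half_width=10.0, steps=20000):
    """Area under the curve by the trapezium rule over +/- half_width sigmas."""
    a, b = -half_width * sigma, half_width * sigma
    dx = (b - a) / steps
    total = 0.5 * (normal_height(a, sigma) + normal_height(b, sigma))
    for i in range(1, steps):
        total += normal_height(a + i * dx, sigma)
    return total * dx


if __name__ == "__main__":
    sigma = 2.0
    print("slope at the centre:", slope(0.0, sigma))
    print("second derivative at x = sigma:", second_derivative(sigma, sigma))
    print("modal ordinate:", normal_height(0.0, sigma))
    print("total area:", area(sigma))
```

With σ = 2 the slope at the centre and the second derivative at x = σ both come out as zero to within rounding, and the area comes out as 1, in agreement with the results obtained above.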


It is often necessary to find the area of a curve which lies 
between the central line and a vertical line at a distance from the 
origin, or the area of the ‘tail’ of the curve beyond a given value 
of x. Tables are provided of the values of such areas in Chapter V. 
These are usually denoted by q. It will be seen that the sum of 
these two areas is equal to the total area of the curve on one or 
other side of the central line. The value of these areas may be 
found from statistical tables or in any particular case by 
integrating the formula for the curve between limits, e.g. the area of 
the tail of the curve beyond a point $x_1$ on one side of the curve is 
given by 

$$\int_{x_1}^{\infty} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}\, dx$$
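
Where tables are not to hand, these two areas can be computed from the error function in Python's standard `math` module. This is a sketch under the assumption of a curve of unit area; the names `tail_area` and `central_area` are invented for the illustration.

```python
import math


def tail_area(x1, sigma=1.0):
    """Area of the tail beyond x1 (x1 >= 0) for a normal curve of unit area."""
    return 0.5 * math.erfc(x1 / (sigma * math.sqrt(2)))


def central_area(x1, sigma=1.0):
    """Area between the central line and the ordinate at x1."""
    return 0.5 * math.erf(x1 / (sigma * math.sqrt(2)))


if __name__ == "__main__":
    for x1 in (0.5, 1.0, 1.96, 2.58):
        print(x1, round(central_area(x1), 4), round(tail_area(x1), 4))
```

For every value of $x_1$ the two areas add up to one half, the area of the curve on one side of the central line.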


1 This value of z is not to be confused with Fisher's z′ which is the hyperbolic 
arc tangent of r the correlation coefficient, i.e. $z' = \tanh^{-1} r$. 


The Principle of Least Squares 

This important principle for finding a line of best fit may be 
justified by the use of the formula for the normal curve. 

If we can assume that the distances of the points from the line 
(or errors) make a normal distribution, the frequency of a 
particular error (i.e. a point at a distance of x from the line) is given 
by 

$$y = \frac{N}{\sigma\sqrt{2\pi}}\, e^{-x^2/2\sigma^2}$$

and the probability of its occurrence $= \frac{y}{N}$, 
where N is the total number of points. 

The probabilities of the occurrence of the errors $x_1$, $x_2$, $x_3$, etc., 
are given by 

$$\frac{y_1}{N},\ \frac{y_2}{N},\ \frac{y_3}{N},\ \text{etc., respectively.}$$

The probability that these errors will occur simultaneously is 
given by their product. 

$$P = \frac{y_1\, y_2\, y_3 \cdots}{N^n} = \frac{1}{(\sigma\sqrt{2\pi})^n\; e^{(x_1^2 + x_2^2 + x_3^2 + \cdots)/2\sigma^2}}$$

where n is the number of errors considered. 

The value of P will be greatest when the denominator is least 
and this will occur when $(x_1^2 + x_2^2 + x_3^2 + \cdots)$ is minimized, 
thus producing a maximum probability of the concurrence of 
the errors. 


THE SPEARMAN RANKS FORMULA FOR 
CORRELATION 


$$\rho = 1 - \frac{6\Sigma d^2}{N(N^2 - 1)}$$

Sums and Differences Formulae for r 


Suppose d is the difference between any two paired scores when 
these are expressed in terms of deviations from the means of their 
respective series, i.e. d = x − y. 

$$\sigma_d^2 = \frac{\Sigma d^2}{N} = \frac{\Sigma (x - y)^2}{N} = \frac{\Sigma x^2}{N} - \frac{2\Sigma xy}{N} + \frac{\Sigma y^2}{N}$$

Multiplying the middle term $\frac{2\Sigma xy}{N}$ by $\frac{\sigma_x \sigma_y}{\sigma_x \sigma_y}$ we obtain 

$$\sigma_d^2 = \sigma_x^2 - 2\,\frac{\Sigma xy}{N \sigma_x \sigma_y}\,\sigma_x \sigma_y + \sigma_y^2$$

$$\sigma_d^2 = \sigma_x^2 + \sigma_y^2 - 2r\,\sigma_x \sigma_y$$

$$r = \frac{\sigma_x^2 + \sigma_y^2 - \sigma_d^2}{2\,\sigma_x \sigma_y}$$

The formula still holds if we work with the differences between 
raw scores instead of the differences of deviations from the mean. 
If D is the difference between raw scores then $\sigma_d = \sigma_D$ and the 
formula becomes 

$$r = \frac{\sigma_x^2 + \sigma_y^2 - \sigma_D^2}{2\,\sigma_x \sigma_y}$$

If the variabilities of the two arrays of scores are equal (as they 
will approximately be in two forms of a test) $\sigma_x^2 = \sigma_y^2 = \sigma^2$ and the 
formula reduces to 

$$r = 1 - \frac{\sigma_D^2}{2\sigma^2}$$

The formula may be further simplified if it can be assumed 
that the means as well as the variabilities of the two arrays are 
equal. 

$$\sigma_D^2 = \frac{\Sigma D^2}{N} - \left(\frac{\Sigma D}{N}\right)^2$$

where D is a difference between corresponding raw scores. 

$$\Sigma D = \Sigma(X - Y) = (\Sigma X - \Sigma Y) = (N M_x - N M_y) = N(M_x - M_y)$$

But if the means are equal $M_x - M_y = 0$ and therefore $\left(\frac{\Sigma D}{N}\right)^2 = 0$. 

Thus 

$$r = 1 - \frac{\Sigma D^2}{2N\sigma^2}$$

This is a useful formula to employ in the correlation of two forms 
of the same test or two halves of one test, and it is also important 
because the Spearman ranks correlation formula is developed 
from it. 
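
The sums and differences formula can be checked against the product-moment coefficient with a short Python sketch. The two sets of marks are invented for the illustration, as are the function names; standard deviations are taken with N (not N − 1) in the denominator, as in the text.

```python
import math


def mean(values):
    return sum(values) / len(values)


def sd(values):
    """Standard deviation with N in the denominator."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def product_moment_r(xs, ys):
    """r = sum(xy) / (N * sd_x * sd_y), with x and y as deviations from the means."""
    mx, my = mean(xs), mean(ys)
    sum_xy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sum_xy / (len(xs) * sd(xs) * sd(ys))


def r_from_differences(xs, ys):
    """r = (sd_x^2 + sd_y^2 - sd_D^2) / (2 * sd_x * sd_y), with D = X - Y on raw scores."""
    diffs = [x - y for x, y in zip(xs, ys)]
    return (sd(xs) ** 2 + sd(ys) ** 2 - sd(diffs) ** 2) / (2 * sd(xs) * sd(ys))


if __name__ == "__main__":
    form_a = [12, 15, 9, 18, 11, 14]   # hypothetical marks on one form of a test
    form_b = [10, 16, 8, 17, 13, 12]   # hypothetical marks on the other form
    print(product_moment_r(form_a, form_b))
    print(r_from_differences(form_a, form_b))
```

The two printed values agree, which illustrates that working with the differences of raw scores gives the same r as working with the products of deviations.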


If $\sigma^2$ is the square of the standard deviation of a set of n ranks 

$$\sigma^2 = \frac{1^2 + 2^2 + 3^2 + \cdots + n^2}{n} - \left(\frac{1 + 2 + 3 + \cdots + n}{n}\right)^2$$

i.e. 

$$\sigma^2 = \frac{\Sigma n^2}{n} - \left(\frac{\Sigma n}{n}\right)^2$$

By adding the identities 

$$(n+1)^3 - n^3 = 3n^2 + 3n + 1$$
$$n^3 - (n-1)^3 = 3(n-1)^2 + 3(n-1) + 1$$
$$(n-1)^3 - (n-2)^3 = 3(n-2)^2 + 3(n-2) + 1$$
$$\cdots$$
$$2^3 - 1^3 = 3 \cdot 1^2 + 3 \cdot 1 + 1$$

we obtain 

$$(n+1)^3 - 1 = 3\Sigma n^2 + 3\Sigma n + n$$

Σn is the sum of the first n natural numbers, i.e. half the sum 
of the first and last number multiplied by the number of terms. 

$$\Sigma n = \frac{n(n+1)}{2}$$

Substituting in the identity $n^3 + 3n^2 + 3n = 3\Sigma n^2 + 3\Sigma n + n$ 

$$3\Sigma n^2 = n^3 + 3n^2 + 3n - \frac{3n(n+1)}{2} - n$$
$$6\Sigma n^2 = 2n^3 + 6n^2 + 6n - 3n(n+1) - 2n$$
$$6\Sigma n^2 = 2n^3 + 3n^2 + n = n(2n+1)(n+1)$$
$$\Sigma n^2 = \frac{n(2n+1)(n+1)}{6}$$

Substituting in our variance formula $\sigma^2 = \frac{\Sigma n^2}{n} - \left(\frac{\Sigma n}{n}\right)^2$ we have 

$$\sigma^2 = \frac{n(2n+1)(n+1)}{6n} - \frac{n^2(n+1)^2}{4n^2} = \frac{n^2 - 1}{12}$$

Now if ρ is the correlation coefficient between pairs of scores 
assuming that the variabilities and the means of the two sets of 
ranks are equal 

$$\rho = 1 - \frac{\Sigma d^2}{2n\sigma^2}$$

which gives $\rho = 1 - \frac{6\Sigma d^2}{n(n^2 - 1)}$ 
by substituting for $\sigma^2$. 
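
Both results of this derivation can be verified directly. In the Python sketch below the function names and the example rankings are invented for the illustration, and exact fractions are used so that the comparison with $(n^2 - 1)/12$ is free of rounding.

```python
from fractions import Fraction


def rank_variance(n):
    """Variance of the ranks 1..n, computed as (sum n^2)/n - ((sum n)/n)^2."""
    ranks = range(1, n + 1)
    return Fraction(sum(r * r for r in ranks), n) - Fraction(sum(ranks), n) ** 2


def spearman_rho(ranks_p, ranks_r):
    """rho = 1 - 6 * sum(d^2) / (n (n^2 - 1)) for two rankings without ties."""
    n = len(ranks_p)
    sum_d2 = sum((p - r) ** 2 for p, r in zip(ranks_p, ranks_r))
    return 1 - Fraction(6 * sum_d2, n * (n * n - 1))


if __name__ == "__main__":
    print(rank_variance(7), Fraction(7 * 7 - 1, 12))
    # Hypothetical orders of merit of five pupils in two subjects.
    print(spearman_rho([1, 2, 3, 4, 5], [2, 1, 4, 3, 5]))
```

For seven ranks the variance is 4, which equals (49 − 1)/12; the example rankings give ρ = 4/5.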


This can also be demonstrated in a simpler way: 


It would appear from the following identities: 

$$1^2 + 3^2 = \tfrac{1}{6} \times 4 \times (4^2 - 1)$$
$$2^2 + 4^2 = \tfrac{1}{6} \times 5 \times (5^2 - 1)$$
$$1^2 + 3^2 + 5^2 = \tfrac{1}{6} \times 6 \times (6^2 - 1)$$
$$2^2 + 4^2 + 6^2 = \tfrac{1}{6} \times 7 \times (7^2 - 1)$$

that the sum of the squares of consecutive odd numbers or 
consecutive even numbers beginning with 2 as far as N − 1 is $\tfrac{1}{6}N(N^2 - 1)$. 

Now consider the following cases of perfect negative rank 
correlation (i.e. ρ = −1): 

              Order of Merit    Order of Merit    Difference in 
Case          (rank)            (rank)            rank squared 
(N is odd)    in subject P      in subject R      d² 
A             1                 7                 6² 
B             2                 6                 4² 
C             3                 5                 2² 
D             4                 4                 0² 
E             5                 3                 2² 
F             6                 2                 4² 
G             7                 1                 6² 

(N is even)   in subject P      in subject R      d² 
A             1                 8                 7² 
B             2                 7                 5² 
C             3                 6                 3² 
D             4                 5                 1² 
E             5                 4                 1² 
F             6                 3                 3² 
G             7                 2                 5² 
H             8                 1                 7² 


It will be seen that in both cases where there is perfect negative 
correlation $\Sigma d^2 = \tfrac{1}{3}N(N^2 - 1)$. Obviously when the ranks are 
identical and there is perfect positive correlation $\Sigma d^2 = 0$; 
therefore it is reasonable to suppose that when there is zero correlation 
(i.e. half way between −1 and +1) $\Sigma d^2$ is half way between 
$\tfrac{1}{3}N(N^2 - 1)$ and 0, i.e. $\tfrac{1}{6}N(N^2 - 1)$. 

Now if $\Sigma d^2$ were determined by chance alone (no correlation) 
it would have the value $\tfrac{1}{6}N(N^2 - 1)$. 

Thus $\frac{\Sigma d^2}{\tfrac{1}{6}N(N^2 - 1)}$ gives a measure of the lack of association 
between the ranks or the variance of the set of ranks. 

The correlation coefficient $\rho = 1 - \frac{\Sigma d^2}{\tfrac{1}{6}N(N^2 - 1)}$ 

which can be written $1 - \frac{6\Sigma d^2}{N(N^2 - 1)}$. 
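
The three values of Σd² used in this simpler demonstration can be checked by brute force. The Python sketch below is an illustration with invented function names; "chance alone" is interpreted here, as an assumption, as the average of Σd² over every possible arrangement of the second set of ranks.

```python
from fractions import Fraction
from itertools import permutations


def sum_d_squared(ranks_p, ranks_r):
    return sum((p - r) ** 2 for p, r in zip(ranks_p, ranks_r))


def sum_d_squared_reversed(n):
    """Sum of d^2 when the second ranking is the exact reverse of the first."""
    ranks = list(range(1, n + 1))
    return sum_d_squared(ranks, ranks[::-1])


def average_sum_d_squared(n):
    """Average of sum d^2 over all n! arrangements of the second ranking."""
    ranks = list(range(1, n + 1))
    totals = [sum_d_squared(ranks, perm) for perm in permutations(ranks)]
    return Fraction(sum(totals), len(totals))


if __name__ == "__main__":
    for n in (7, 8):
        print(n, sum_d_squared_reversed(n), Fraction(n * (n * n - 1), 3))
    print(average_sum_d_squared(5), Fraction(5 * (5 * 5 - 1), 6))
```

For N = 7 and N = 8 the reversed rankings give 112 and 168, the values of ⅓N(N² − 1), and for N = 5 the average over all arrangements is 20, the value of ⅙N(N² − 1).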


A NOTE ON CORRELATION AND 
REGRESSION LINES 


Consider N numbers $A_1, A_2, A_3 \ldots$ Denote their mean by $\bar{A}$ 
and the differences or deviations of the numbers from their 
mean by $a_1, a_2, a_3 \ldots$ etc., so that the mean of these is 0. 

The standard deviation is $\sigma_a = \sqrt{\frac{\Sigma a^2}{N}}$. 

If $\sigma_a = 1$ the numbers $a_1, a_2, a_3$ are said to be in standard 
measure. (Alternatively, we could have achieved the same result 
by dividing the deviations from the mean by the standard 
deviation.) 

Consider a second set of N numbers $B_1, B_2, B_3 \ldots$ and in the 
same way derive from them $b_1, b_2, b_3 \ldots$ and $\sigma_b$ the standard 
deviation of this set. 

The coefficient of correlation $r_{ab} = \frac{\Sigma ab}{N \sigma_a \sigma_b}$ by definition. 

Consider the identity 

$$\tfrac{1}{2}\Sigma\Sigma (a_p b_q - a_q b_p)^2 = (\Sigma a^2)(\Sigma b^2) - (\Sigma ab)^2 = N^2 \sigma_a^2 \sigma_b^2 (1 - r_{ab}^2)$$

It follows from this that if $r_{ab} = \pm 1$, $\frac{b_p}{a_p} = \frac{b_q}{a_q}$ for all values of 
p and q, giving a straight line relationship between each A and the 
corresponding value of B. (Note that as the left-hand side of the 
identity, being a square, cannot be negative, $r_{ab}$ cannot lie outside 
the limits −1 and +1.) 

Normally no such exact linear relation exists but we may find 
the line of best fit by finding one which will make the sum of the 
squares of the distances of points from it a minimum. 

Choose λ and μ so that $\Sigma(b - \lambda a - \mu)^2$ is a minimum. 
Differentiating partially with respect to λ and μ, we obtain 

$$-2\Sigma a(b - \lambda a - \mu) = 0 = -2\Sigma(b - \lambda a - \mu)$$

which, as $\Sigma a = N\bar{a} = 0$ and similarly $\Sigma b = 0$, 
reduce to 

$$-2(N \sigma_a \sigma_b r_{ab} - \lambda N \sigma_a^2) = 0 = -2(-N\mu)$$

and thus $\lambda = \frac{\sigma_b\, r_{ab}}{\sigma_a}$ and μ = 0. 

The line of regression of B on A is given by 

$$b = \left(\frac{r_{ab}\,\sigma_b}{\sigma_a}\right) a$$

and the line of regression of A on B is given by 

$$a = \left(\frac{r_{ab}\,\sigma_a}{\sigma_b}\right) b.$$

If the As and Bs are quite independent $r_{ab}$ will approximate to 
zero if N is large enough. The converse is only true for a linear 
relationship. In the case of the parabolic curve $b^2 = a$, $r_{ab}$ would 
equal 0 and we should use the correlation ratio instead of the 
coefficient. Thus, independence involves zero correlation, if N is 
large enough, but zero correlation does not necessarily imply 
independence. 
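
Both points, the slope of the regression line and the parabolic case, can be illustrated with a short Python sketch. The data and the function names are invented for the illustration; the least-squares slope is computed as Σab/Σa² and compared with $r_{ab}\sigma_b/\sigma_a$.

```python
import math


def deviations(values):
    m = sum(values) / len(values)
    return [v - m for v in values]


def sd(devs):
    """Standard deviation of a set of deviations, with N in the denominator."""
    return math.sqrt(sum(d * d for d in devs) / len(devs))


def correlation(a, b):
    """r_ab = sum(ab) / (N * sd_a * sd_b) for deviations a and b."""
    return sum(x * y for x, y in zip(a, b)) / (len(a) * sd(a) * sd(b))


def slope_b_on_a(a, b):
    """Least-squares slope of b on a for deviations: sum(ab) / sum(a^2)."""
    return sum(x * y for x, y in zip(a, b)) / sum(x * x for x in a)


if __name__ == "__main__":
    a = deviations([2, 4, 5, 7, 9, 11])   # hypothetical scores A
    b = deviations([1, 3, 6, 6, 9, 10])   # hypothetical scores B
    print(slope_b_on_a(a, b), correlation(a, b) * sd(b) / sd(a))

    # Parabolic relation a = b^2: complete dependence, yet zero correlation.
    b_values = [-3, -2, -1, 0, 1, 2, 3]
    a_values = [v * v for v in b_values]
    print(correlation(deviations(a_values), deviations(b_values)))
```

The first line of output shows the two expressions for the slope agreeing; the second shows a correlation of zero although every a is exactly determined by its b.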

The product-moment formula for r may be obtained from the 
regression line by a simple method, which is complementary to 
the above. 

Let the equation of best fit be y = bx (its slope will be b). 

Consider the points which represent the paired scores on a 
scatter diagram. It will suffice to take their ordinate distances 
from the line as these will bear a constant relationship to the 
normal distances from the points to the line which are actually 
considered in the method of least squares. The error in the 
ordinate by which a point (x, y) misses the line is (y − Y), where 
Y = bx is the ordinate of the line. For best fit $\Sigma(y - Y)^2$ must be 
a minimum. 

$$\Sigma(y - Y)^2 = \Sigma(y - bx)^2 = \Sigma y^2 - 2b\,\Sigma xy + b^2\,\Sigma x^2$$

1 Adapted from ‘Mathematics and Psychology’, PIAGGIO, Mathematical Gazette, 
February 1933. This paper also contains ‘An analysis of the factor g, if it exists’. 

As we are finding b the slope of the line we must differentiate 
the expression with respect to b and equate to zero. 

Thus $-2\Sigma xy + 2b\,\Sigma x^2 = 0$ 

$$b = \frac{\Sigma xy}{\Sigma x^2}$$

(This is called the regression formula for y on x and is often used 
in economic statistics in this form.) 

Dividing by N: $\frac{b\,\Sigma x^2}{N} = \frac{\Sigma xy}{N}$ 

But $\frac{\Sigma x^2}{N} = \sigma_x^2$. Thus $b\,\sigma_x^2 = \frac{\Sigma xy}{N}$ 

$$\therefore\; b = \frac{\Sigma xy}{N \sigma_x^2}$$

We now have to standardize our deviations x and y by dividing 
them by their respective standard deviations $\sigma_x$ and $\sigma_y$. 
Let the slope of the line after this standardization = r. 

But y = bx, so that $\frac{y}{\sigma_y} = r\,\frac{x}{\sigma_x}$ gives $r = b\,\frac{\sigma_x}{\sigma_y}$ and $b = r\,\frac{\sigma_y}{\sigma_x}$. 

On substitution $r = b\,\frac{\sigma_x}{\sigma_y} = \frac{\Sigma xy}{N\sigma_x^2} \cdot \frac{\sigma_x}{\sigma_y} = \frac{\Sigma xy}{N\sigma_x\sigma_y}$ 


VALIDATION OF TEST MATERIAL 


The validity of a test is the degree to which it tests the ability 
that it is supposed to test. It is measured by the correlation 
between the test-scores and the scores obtained from a reliable 
standard, if one is obtainable. In view of the widespread use of 
intelligence tests the need for new ones is apparent but these must 
be adjusted and modified until they give a high correlation (at 
least .9) with a well-tried intelligence test such as the Terman-Merrill 
Revision of the Binet Scale. 

Item validity has to be measured by reference to the test itself. 
The measurement of the difficulty of the items is an easier matter 
for it can be found by the proportion of the children who are 
unable to give the correct answers. A useful and satisfactory way 
of finding item validity is known as the method of upper and 
lower thirds. This works in the following manner. 

(1) Arrange the scripts in order of merit, highest scores at the 
top and lowest at the bottom. These scores are called criterion 
scores. 

(2) Divide the scripts into three equal groups: upper (U), 
middle (M) and lower (L). 

(3) Calculate the percentages of children in each group who 
answer a particular item successfully. 


Here is an example: 

The table below is prepared from a list giving the mark gained 
by each of 37 boys for each question. The scores are divided into 
three groups based on total score and the percentage in each 
group getting a particular item correct is calculated. 


Column U = % correct in upper 10 
       M = % correct in middle 17 
       L = % correct in lower 10 

The validity V = U − L 

and the difficulty D = 100 − (U + M + L)/3 

N.B. A slide rule is of great assistance in doing these quickly. 


Question    U%    M%    L%     V      D 
1           73    64    22     51     47 
2           77    53    22     55     49 
3           82    36    11     71     57 
4           73    36     0     73     64 
5           70    68    74     −4     29 
6           50     9     0     50     80 
7           32     3    11     21     85 
8            0     0     0      0    100 
9           50    11    17     33     74 
10          36    39    33      3     64 
11          73    85    30     43     37 
12          45    [illegible]   0     45     [illegible] 


[Scatter diagram: validity (vertical axis, 0 to 100) plotted against 
percentage difficulty (horizontal axis, 10 to 100) for questions 1 to 12] 
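
Where a slide rule is not to hand, the two indices can be worked out mechanically. The Python sketch below is an illustration only: the function names are invented, the percentages are those of questions 1, 2, 5, 6, 7 and 9 in the table, and D is rounded to the nearest whole number as the tabulated values appear to be.

```python
def validity(upper_pct, lower_pct):
    """V = U - L."""
    return upper_pct - lower_pct


def difficulty(upper_pct, middle_pct, lower_pct):
    """D = 100 - (U + M + L) / 3, rounded to the nearest whole number."""
    return round(100 - (upper_pct + middle_pct + lower_pct) / 3)


# (U%, M%, L%) for a selection of questions from the table above.
ITEMS = {
    1: (73, 64, 22),
    2: (77, 53, 22),
    5: (70, 68, 74),
    6: (50, 9, 0),
    7: (32, 3, 11),
    9: (50, 11, 17),
}

if __name__ == "__main__":
    for question, (u, m, l) in ITEMS.items():
        print(question, validity(u, l), difficulty(u, m, l))
```

Question 5, for example, comes out with a negative validity of −4, since slightly more of the lower group than of the upper group answered it correctly.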


The test can now be constructed after a study of the difficulty 
and validity values which have been tabulated. There are no 
hard and fast rules but the following orders of difficulty might 
prove to be satisfactory: 

About 20% of the items of difficulty ranging from 0-40 
About 60% of the items of difficulty ranging from 40-60 
About 20% of the items of difficulty ranging from 60-90 

From a number of items considerably larger than those which 
are required to make up the final test those items having the 
highest validity in each category are selected. Kelley has shown 
that an improvement in this method is effected by taking the 
upper and lower 27% instead of the upper and lower thirds.¹ 

1 For further details see: G. A. FERGUSON, The Reliability of Mental Tests, London, 
1941. LONG and SANDIFORD, The Validation of Test Items, University of Toronto, 
1935. 

APPENDIX VII 


TABLE OF SQUARES AND SQUARE ROOTS 
OF NUMBERS FROM 1 TO 1000 


Number    Square    Square Root 
1              1    1.000 
2              4    1.414 
3              9    1.732 
4             16    2.000 
5             25    2.236 
6             36    2.449 
7             49    2.646 
8             64    2.828 
9             81    3.000 
10          1 00    3.162 
11          1 21    3.317 
12          1 44    3.464 
13          1 69    3.606 
14          1 96    3.742 
15          2 25    3.873 
16          2 56    4.000 
17          2 89    4.123 
18          3 24    4.243 
19          3 61    4.359 
20          4 00    4.472 
21          4 41    4.583 
22          4 84    4.690 
23          5 29    4.796 
24          5 76    4.899 
25          6 25    5.000 
26          6 76    5.099 
27          7 29    5.196 
28          7 84    5.292 
29          8 41    5.385 
30          9 00    5.477 
31          9 61    5.568 
32         10 24    5.657 
33         10 89    5.745 
34         11 56    5.831 
35         12 25    5.916 
36         12 96    6.000 
37         13 69    6.083 
38         14 44    6.164 
39         15 21    6.245 
40         16 00    6.325 
41         16 81    6.403 
42         17 64    6.481 
43         18 49    6.557 
44         19 36    6.633 
45         20 25    6.708 
46         21 16    6.782 
47         22 09    6.856 
48         23 04    6.928 
49         24 01    7.000 
50         25 00    7.071 
51         26 01    7.141 
52         27 04    7.211 
53         28 09    7.280 
54         29 16    7.348 
55         30 25    7.416 
56         31 36    7.483 
57         32 49    7.550 
58         33 64    7.616 
59         34 81    7.681 
60         36 00    7.746 
61         37 21    7.810 
62         38 44    7.874 
63         39 69    7.937 
64         40 96    8.000 
65         42 25    8.062 
66         43 56    8.124 
67         44 89    8.185 
68         46 24    8.246 
69         47 61    8.307 
70         49 00    8.367 


Number    Square    Square Root    Number    Square    Square Root 
71         50 41     8.426         116      1 34 56    10.770 
72         51 84     8.485         117      1 36 89    10.817 
73         53 29     8.544         118      1 39 24    10.863 
74         54 76     8.602         119      1 41 61    10.909 
75         56 25     8.660         120      1 44 00    10.954 
76         57 76     8.718         121      1 46 41    11.000 
77         59 29     8.775         122      1 48 84    11.045 
78         60 84     8.832         123      1 51 29    11.091 
79         62 41     8.888         124      1 53 76    11.136 
80         64 00     8.944         125      1 56 25    11.180 
81         65 61     9.000         126      1 58 76    11.225 
82         67 24     9.055         127      1 61 29    11.269 
83         68 89     9.110         128      1 63 84    11.314 
84         70 56     9.165         129      1 66 41    11.358 
85         72 25     9.220         130      1 69 00    11.402 
86         73 96     9.274         131      1 71 61    11.446 
87         75 69     9.327         132      1 74 24    11.489 
88         77 44     9.381         133      1 76 89    11.533 
89         79 21     9.434         134      1 79 56    11.576 
90         81 00     9.487         135      1 82 25    11.619 
91         82 81     9.539         136      1 84 96    11.662 
92         84 64     9.592         137      1 87 69    11.705 
93         86 49     9.644         138      1 90 44    11.747 
94         88 36     9.695         139      1 93 21    11.790 
95         90 25     9.747         140      1 96 00    11.832 
96         92 16     9.798         141      1 98 81    11.874 
97         94 09     9.849         142      2 01 64    11.916 
98         96 04     9.899         143      2 04 49    11.958 
99         98 01     9.950         144      2 07 36    12.000 
100      1 00 00    10.000         145      2 10 25    12.042 
101      1 02 01    10.050         146      2 13 16    12.083 
102      1 04 04    10.100         147      2 16 09    12.124 
103      1 06 09    10.149         148      2 19 04    12.166 
104      1 08 16    10.198         149      2 22 01    12.207 
105      1 10 25    10.247         150      2 25 00    12.247 
106      1 12 36    10.296         151      2 28 01    12.288 
107      1 14 49    10.344         152      2 31 04    12.329 
108      1 16 64    10.392         153      2 34 09    12.369 
109      1 18 81    10.440         154      2 37 16    12.410 
110      1 21 00    10.488         155      2 40 25    12.450 
111      1 23 21    10.536         156      2 43 36    12.490 
112      1 25 44    10.583         157      2 46 49    12.530 
113      1 27 69    10.630         158      2 49 64    12.570 
114      1 29 96    10.677         159      2 52 81    12.610 
115      1 32 25    10.724         160      2 56 00    12.649 
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Number 
161 
162 
163 
164 
165 


166 
167 
168 
169 
170 


171 
172 
173 
174 
175 


176 
177 
178 
179 
180 


181 
182 
183 
184 
185 


186 
187 
188 
189 
190 


191 
192 
193 
194 
195 


196 
197 
198 
199 
200 


201 
202 
203 
204 
205 


Square 
25921 
2 62 44 
2 65 69 
2 68 96 
27225 


27556 
2 78 89 
2 8224 
2 85 61 
2 89 00 


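2 92 41 
2 95 84 
2 99 29 
3 02 76 
3 06 25 


3 09 76 
3 13 29 
3 16 84 
3 20 41 
3 24 00 


3 27 61 
3 31 24 
3 34 89 
3 38 56 
3 42 25 


3 45 96 
3 49 69 
3 53 44 
3 57 21 
3 61 00 


3 64 81 
3 68 64 
3 72 49 
3 76 36 
3 80 25 


3 84 16 
3 88 09 
3 92 04 
3 96 01 
4 00 00 


4 04 01 
4 08 04 
4 12 09 
4 16 16 
4 20 25 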
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Square Root 
12.689 
12.728 
12.767 
12.806 
12.845 


12.884 
12.923 
12.961 
13.000 
13.038 


13.077 
13.115 
13.153 
13.191 
13.229 


13.266 
13.304 
13.342 
13-379 
13.416 


13.454 
13-491 
13.528 
13-565 
13.601 


13.638 
13.675 
13.711 
13.748 
13.784 


13.820 
13.856 
13.892 
13.928 
13.964 


14.000 
14.036 
14.071 
14.107 
14.142 


14.177 
14.213 
14.248 
14.283 
14.318 


Number 
206 
207 
208 
209 
210 


211 
212 
213 
214 
215 


216 
217 
218 
219 
220 


Square 


Un Un tn Un Un 
[E POW OOOOH cuo 
On Ono DAHIN Aou BIG Un în 
È RIN Oh OON a 
ancen ORF © 


tn Un tn tn Un 
MM DOM 


Vn tn ta t 
OQ woo 
Ou On o 
NW DOO 
t OO. 


Square Root 
14-353 
14.387 
14.422 
14.457 
14.491 


14.526 
14.560 
14.595 
14.629 
14.663 


14.697 
14.731 
14.765 
14.799 
14.832 


14.866 
14.900 
14.933 
14.967 
15.000 


15.033 
15.067 
15.100 
15.133 
15.166 


15.199 
15.232 
15.264 
15.297 
15.330 


15.362 
15.395 
15.427 
15.460 
15.492 


15.524 
15.556 
15.588 
15.620 
15.652 


15.684 
15.716 
15.748 
15.780 
15.811 
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Number Square Square Root Number Square Square Root 
251 6 30 o1 15.843 296 8 76 16 17.205 
252 63504 15.875 297 8 82 09 17.234 
253 6 40 09 15.906 298 8 88 04 17.263 
254 6 45 16 15.937 299 8 94 o1 17.292 
255 6 50 25 15. 300 9 00 00 17.321 
256 6 55 36 16.000 301 9 06 or 17-349 
257 6 60 49 16.031 302 9 12 04 17.378 
258 6 65 64 16.062 303 9 18 o9 17.407 
259 6 70 81 16.093 304 9 24 16 17.436 
260 6 76 oo 16.125 305 9 30 25 17.464 
261 6 81 21 16.155 306 9 36 36 17:493 
262 6 86 44 16.186 307 9 42 49 17.521 
263 6 91 69 16.217 308 9 48 04 17.550 
264 6 96 96 16.248 309 9 54 81 17.578 
265 70225 16.279 310 9 61 00 17.607 
266 7097 56 16.310 311 96721 17.635 
267 7 12 89 16.340 312 9 73 44 17.664 
268 7 18 24 16.371 313 9 79 69 17.692 
269 72361 16.401 314 9 85 96 17.720 
270 7 29 00 16.432 315 9 92 25 17.748 
271 73441 16.462 316 9 98 56 17.776 
272 7 39 84 16.492 317 10 04 89 17.804 
273 7 45 29 16.523 318 10 II 24 17.833 
274 7 50 76 16.553 319 10 17 61 17.861 
275 75625 16.583 320 10 24 00 17.889 
276 7 61 76 16.613 321 10 30 41 17.916 
277 7 67 29 16.643 322 10 36 84 17.944 
278 7 72 84 16.673 323 10 43 29 17.972 
279 778 41 16.703 324 10 49 76 18.000 
280 7 84 00 16.733 325 10 56 25 18.028 
281 7 89 61 16.763 326 10 62 76 18.055 
282 7 95 24 16.793 327 10 69 29 18.083 
283 8 oo 89 16.823 328 10 75 84 18.111 
284 8 06 56 16.852 329 10 82 41 18.138 
285 8 12 25 16.882 330 1o 89 oo 18.166 
286 8 17 96 16.912 331 10 95 61 18.193 
287 8 23 69 16.941 332 11 02 24 18.221 
288 8 29 44 16.971 333 11 o8 89 18.248 
289 8 35 21 17.000 334 11 15 56 18.276 
290 8 41 00 17.029 335 11 22 25 18.303 
291 8 46 81 17.059 336 11 28 96 18.330 
292 8 52 64 17.088 337 11 35 69 18.358 
293 8 58 49 17.117 338 114244 18.385 
294 8 64 36 17.146 339 11 49 21 18.412 
295 8 70 25 17.176 340 11 56 oo 18.439 


Square 
11 62 81 
11 69 64 
11 76 49 
11 83 36 
11 go 25 


11 97 16 
12 04 09 
12 11 04 
12 18 o1 
12 25 00 


12 32 01 
12 39 04 
12 46 09 
12 53 16 
12 60 25 


12 67 36 
12 74 49 
12 i R 
12 88 81 
12 96 oo 


13 03 21 
13 10 

13 17 69 
13 24 96 
13 32 25 


13 39 i 
13 46 89 
13 54 24 
13 61 61 
13 69 00 


13 76 41 
13 83 84 
13 91 29 
13 98 76 
14 06 25 


14 1376 
14 21 29 
14 28 84 
14 36 41 
14 44 00 


14 51 61 
14 59 24 
14 66 89 
14 74 56 
14 82 25 
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Square Root 
18.466 
18.493 
18.520 
18.547 
18.574 


18.601 
18.628 
18.655 
18,682 
18.708 


18.735 
18.762 
18.788 
18.815 
18.841 


18.868 
18.894 
18.921 
18.947 
18.974 


19.000 
19.026 
19.053 
19.079 
19.105 


19.131 
19.157 
19.183 
19.209 
19.235 


19.261 
19.287 
19.313 
19.339 
19.363 


19.391 
19.416 
19.442 
19.468 
19.494 


19.519 
19.545 
19.570 
19.596 
19.621 


Number 
386 
387 
388 
389 
390 


391 
392 
393 
394 
395 


396 
397 
398 
399 


Square 
14 89 96 
14 97 69 
15 05 44 
15 13 21 
15 21 00 


15 28 81 
15 36 64 
15 44 49 
18 52 36 
15 60 25 


15 68 16 
15 76 09 
18 84 04 
15 92 O1 
16 oo oo 


16 o8 o1 
16 16 04 
16 24 09 
16 32 16 
16 40 25 


16 48 36 
16 56 49 
16 64 64 
16 72 81 
16 Sr oo 


16 89 21 
16 97 44 
17 05 69 
17 13 96 
17 22 25 


17 30 56 
17 38 89 
17 47 24 
17 55 61 
17 64 00 


17 72 41 
17 80 84 
17 89 29 
17 97 76 
18 06 25 


18 14 76 
18 23 29 
18 31 84 
18 40 41 
18 49 00 


Square Root 
19.647 
19.672 
19.698 
19.723 
19.748 


19.774 
19.799 
19.824 
19.849 
19.875 


19.900 
19 92 

19.640 
19.975 
20.000 


20.494 


20.518 
20.543 
20.567 
20.591 
20,616 


20.640 
20.664 
20.688 
20.712 
20.736 


| 
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Number 
431 
432 
433 
434 
435 


20 34 01 
20 43 04 
20 52 09 
20 61 16 
20 70 25 


20 79 36 
20 88 49 
a0 97 64 
21 06 81 
21 16 oo 


21 25 21 
ar 34 
21 43 
21 52 96 
21 62 25 


21 71 56 
21 8o 89 
21 90 24 
21 99 61 
22 09 00 


22 18 41 
22 27 

22 37 29 
22 46 76 
22 56 25 


Square Root 
20.761 
20.785 
20.809 
20.833 
20.857 


20.881 
20.90! 
20.9; 
20.952 
20.976 


21.000 
21.02. 
21.0. 
21.071 
21,098 


21.119 
21.142 
AL 506 
21.190 
21.213 


21.237 
21,260 
21.284 
21.307 
21.331 


21.354 
21.375 
21.401 
21.42. 
21.44! 


21.471 
21.494 
21.517 
21.541 
21.504 


21.587 
21.610 
21.633 
21.656 
21.679 


21.703
21.726
21.749
21.772
21.794


Number 
476 


Square 

22 65 76 
22 75 29 
22 84 84 
22 94 41 
23 04 00 


23 13 61 
23 23 24 
23 32 89 
23 42 56 
23 52 25


23 61 96 


25 10 01
25 20 04 


26 11 21 
26 21 44
26 31 69 
26 41 96 
26 52 25 


Square Root 
21.817 
21.840
21.863 
21.886 
21.909 


21.932 
21.954 
21.977 
22.000 
22.023 


22.045
22.068
22.091
22.113
22.136


22.159 
22.181
22.204 
22.226 


22.249


22.271 
22.293 
22.316 
22.338 
22.361 


22.383 
22.405
22.428
22.450 
22.472 


22.494 
22.517 
22.539 
22.561 
22.583 


22.605 
22.627 
22.650 
22.672 
22.694 


22.716
22.738
22.760
22.782
22.804
Number
521
522
523 
524 
525 


526 
527 
528 
529 
530 


531 
532 
533 
534 
535 


536 
537 
538 
539 
540 


541 
542 
543 
544 
545 


546 
547 
548 
549 
550 


551 
552 
553 
554 
555 


556 
557 
558 
559 
560 


561 
562 
563 
564
565 


Square 

27 14 41
27 24 84 
27 35 29 
27 45 76 
27 56 25 


27 66 76 
27 77 29 
27 87 84 
27 98 41
28 09 00


28 19 61 
28 30 24 
28 40 89 
28 51 56 
28 62 25 


28 72 96 
28 83 69 
28 94 44 
29 05 21 
29 16 00


29 26 81 
29 37 64 
29 48 49 
29 59 36 
29 70 25 


29 81 16 
29 92 09 
30 03 04 
30 14 01
30 25 00 


30 36 01
30 47 04
30 58 09
30 69 16
30 80 25


30 91 36 
31 02 49 
31 13 64 
31 24 81 
31 36 00 


31 47 21 
31 58 44 
31 69 69 
31 80 96 
31 92 25 
Square Root
22.825 
22.847 
22.869 
22.891
22.913 


22.935 
22.956 
22.978 
23.000 
23.022 


23.043 
23.065 
23.087 
23.108 
23.130 


23.152 
23.173 
23.195 
23.216 
23.238 


23.259 
23.281 
23.302 
23.324 
23.345 


23.367 
23.388 
23.409 
23.431 
23.452 


23.473 
23.495 
23.516 
23.537 
23.558 


23.580 
23.601 
23.622 
23.643 
23.664


23.685 
23.707 
23.728 
23.749
23.770 


Number 
566 
567 
568 
569 
570 


571 
572 
573 
574 
575 


576 
577 
578 
579 
580 


581 
582 
583 
584 
585 


586 
587 
588 
589 
590 


591 
592 
593 
594 
595 


596 
597 
598 
599 
600 


601 
602 
603 
604 
605 


606 
607 
608 
609 
610 


Square 

32 03 56 
32 14 89 
32 26 24 
32 37 61 
32 49 00 


32 60 41 
32 71 84 
32 83 29 
32 94 76 
33 06 25 


33 17 76 
33 29 29 
33 40 84 
33 52 41 
33 64 00 


33 75 61 
33 87 24 
33 98 89 
34 10 56 
34 22 25 


34 33 96 
34 45 69 
34 57 44 
34 69 21 
34 81 00


34 92 81 
35 04 64 
35 16 49 
35 28 36 
35 40 25 


35 52 16 
35 64 09 
35 76 04 
35 88 01
36 00 00


36 12 01
36 24 04 
36 36 09 
36 48 16 
36 60 25 


36 72 36 
36 84 49 
36 96 64 
37 08 81 
37 21 00 


Square Root 
23.791 
23.812 
23.833 
23.854 
23.875 


23.896 
23.917 
23.937 
23.958 
23.979 


24.000
24.021
24.041
24.062
24.083


24.104 
24.125 
24.145 
24.166 
24.187 


24.207 
24.228 
24.249 
24.269 
24.290 


24.310
24.331 
24.352 
24.372 
24.393 


24.413
24.434
24.454 
24.474 
24.495 


24.515 
24.536
24.556 
24.576 
24.597


24.617 
24.637
24.658 
24.678 
24.698 


Number   Square     Square Root   Number   Square     Square Root
611      37 33 21   24.718        656      43 03 36   25.612
612      37 45 44   24.739        657      43 16 49   25.632
613      37 57 69   24.759        658      43 29 64   25.652
614      37 69 96   24.779        659      43 42 81   25.671
615      37 82 25   24.799        660      43 56 00   25.690
616      37 94 56   24.819        661      43 69 21   25.710
617      38 06 89   24.839        662      43 82 44   25.729
618      38 19 24   24.860        663      43 95 69   25.749
619      38 31 61   24.880        664      44 08 96   25.768
620      38 44 00   24.900        665      44 22 25   25.788
621      38 56 41   24.920        666      44 35 56   25.807
622      38 68 84   24.940        667      44 48 89   25.826
623      38 81 29   24.960        668      44 62 24   25.846
624      38 93 76   24.980        669      44 75 61   25.865
625      39 06 25   25.000        670      44 89 00   25.884
626      39 18 76   25.020        671      45 02 41   25.904
627      39 31 29   25.040        672      45 15 84   25.923
628      39 43 84   25.060        673      45 29 29   25.942
629      39 56 41   25.080        674      45 42 76   25.962
630      39 69 00   25.100        675      45 56 25   25.981
631      39 81 61   25.120        676      45 69 76   26.000
632      39 94 24   25.140        677      45 83 29   26.019
633      40 06 89   25.159        678      45 96 84   26.038
634      40 19 56   25.179        679      46 10 41   26.058
635      40 32 25   25.199        680      46 24 00   26.077
636      40 44 96   25.219        681      46 37 61   26.096
637      40 57 69   25.239        682      46 51 24   26.115
638      40 70 44   25.259        683      46 64 89   26.134
639      40 83 21   25.278        684      46 78 56   26.153
640      40 96 00   25.298        685      46 92 25   26.173
641      41 08 81   25.318        686      47 05 96   26.192
642      41 21 64   25.338        687      47 19 69   26.211
643      41 34 49   25.357        688      47 33 44   26.230
644      41 47 36   25.377        689      47 47 21   26.249
645      41 60 25   25.397        690      47 61 00   26.268
646      41 73 16   25.417        691      47 74 81   26.287
647      41 86 09   25.436        692      47 88 64   26.306
648      41 99 04   25.456        693      48 02 49   26.325
649      42 12 01   25.475        694      48 16 36   26.344
650      42 25 00   25.495        695      48 30 25   26.363
651      42 38 01   25.515        696      48 44 16   26.382
652      42 51 04   25.534        697      48 58 09   26.401
653      42 64 09   25.554        698      48 72 04   26.420
654      42 77 16   25.573        699      48 86 01   26.439
655      42 90 25   25.593        700      49 00 00   26.458
Number
701 


Square 
49 14 01
49 28 04 
49 42 09 
49 56 16 
49 70 25 


49 84 36 
49 98 49 
50 12 64 
50 26 81 
50 41 00 


50 55 21 
50 69 44 
50 83 69 
50 97 96 
51 12 25


51 26 56 
51 40 89 
51 55 24
51 69 61 
51 84 00 


51 98 41 
52 12 84 
52 27 29 
52 41 76 
52 56 25 


52 70 76 
52 85 29
52 99 84
53 14 41
53 29 00 


53 43 61 
53 58 24 
53 72 89 
53 87 56 
54 02 25 


54 16 96 
54 31 69 
54 46 44
54 61 21
54 76 00


54 90 81 
55 05 64
55 20 49
55 35 36
55 50 25 
Square Root
26.476 
26.495 
26.514 
26.533 
26.552 


26.571 
26.589 
26.608 
26.627 
26.646 


26.665 
26.683 
26.702 
26.721 
26.739 


26.758 
26.777 
26.796 
26.814 
26.833 


26.851 
26.870 
26.889 
26.907 
26.926 


26.944 
26.963 
26.981 
27.000 
27.019 


27.037 
27.055 
27.074 
27.092
27.111


27.129
27.148
27.166 
27.185 
27.203 


27.221 
27.240
27.258 
27.276 
27.295


Number 
746 


Square 
55 65 16 
55 80 09 
55 95 04 
56 10 01
56 25 00 


56 40 01
56 55 04
56 70 09
56 85 16
57 00 25


57 15 36 
57 30 49 
57 45 64 
57 60 81 
57 76 00


57 91 21 
58 06 44 
58 21 69 
58 36 96 
58 52 25 


58 67 56 
58 82 89 
58 98 24 
59 13 61 
59 29 00 


59 44 41 
59 59 84 
59 75 29 
59 90 76 
60 06 25 


60 21 76 
60 37 29 
60 52 84 
60 68 41 
60 84 00 


60 99 61 
61 15 24 
61 30 89 
61 46 56 
61 62 25 


61 77 96 
61 93 69 
62 09 44 
62 25 21 
62 41 00 


Square Root 
27.313 
27.331 
27.350 
27.368 
27.386 


27.404 
27.423 
27.441 
27.459 
27.477 


27.495
27.514 
27.532 
27.550 
27.568 


27.586 
27.604 
27.622 
27.641 
27.659 


27.677 
27.695 
27.713 
27.731
27.749 


27.767 
27.785
27.803 
27.821 
27.839 


27.857 
27.875 
27.893 
27.911 
27.928 


27.946
27.964
27.982 
28.000 
28.018 


28.036 
28.054 
28.071 
28.089 
28.107 
Number   Square     Square Root   Number   Square     Square Root
791      62 56 81   28.125        836      69 88 96   28.914
792      62 72 64   28.142        837      70 05 69   28.931
793      62 88 49   28.160        838      70 22 44   28.948
794      63 04 36   28.178        839      70 39 21   28.965
795      63 20 25   28.196        840      70 56 00   28.983
796      63 36 16   28.213        841      70 72 81   29.000
797      63 52 09   28.231        842      70 89 64   29.017
798      63 68 04   28.249        843      71 06 49   29.034
799      63 84 01   28.267        844      71 23 36   29.052
800      64 00 00   28.284        845      71 40 25   29.069
801      64 16 01   28.302        846      71 57 16   29.086
802      64 32 04   28.320        847      71 74 09   29.103
803      64 48 09   28.337        848      71 91 04   29.120
804      64 64 16   28.355        849      72 08 01   29.138
805      64 80 25   28.373        850      72 25 00   29.155
806      64 96 36   28.390        851      72 42 01   29.172
807      65 12 49   28.408        852      72 59 04   29.189
808      65 28 64   28.425        853      72 76 09   29.206
809      65 44 81   28.443        854      72 93 16   29.223
810      65 61 00   28.460        855      73 10 25   29.240
811      65 77 21   28.478        856      73 27 36   29.257
812      65 93 44   28.496        857      73 44 49   29.275
813      66 09 69   28.513        858      73 61 64   29.292
814      66 25 96   28.531        859      73 78 81   29.309
815      66 42 25   28.548        860      73 96 00   29.326
816      66 58 56   28.566        861      74 13 21   29.343
817      66 74 89   28.583        862      74 30 44   29.360
818      66 91 24   28.601        863      74 47 69   29.377
819      67 07 61   28.618        864      74 64 96   29.394
820      67 24 00   28.636        865      74 82 25   29.411
821      67 40 41   28.653        866      74 99 56   29.428
822      67 56 84   28.671        867      75 16 89   29.445
823      67 73 29   28.688        868      75 34 24   29.462
824      67 89 76   28.705        869      75 51 61   29.479
825      68 06 25   28.723        870      75 69 00   29.496
826      68 22 76   28.740        871      75 86 41   29.513
827      68 39 29   28.758        872      76 03 84   29.530
828      68 55 84   28.775        873      76 21 29   29.547
829      68 72 41   28.792        874      76 38 76   29.563
830      68 89 00   28.810        875      76 56 25   29.580
831      69 05 61   28.827        876      76 73 76   29.597
832      69 22 24   28.844        877      76 91 29   29.614
833      69 38 89   28.862        878      77 08 84   29.631
834      69 55 56   28.879        879      77 26 41   29.648
835      69 72 25   28.896        880      77 44 00   29.665
Number
881 


Square 

77 61 61 
77 79 24
77 96 89 
78 14 56 
78 32 25 


78 49 96 
78 67 69 
78 85 44 
79 03 21 
79 21 00 


79 38 81 
79 56 64 
79 74 49 
79 92 36 
80 10 25 


80 28 16
80 46 09
80 64 04
80 82 01
81 00 00


81 18 01
81 36 04 
81 54 09 
81 72 16 
81 90 25


82 08 36 
82 26 49
82 44 64
82 62 81
82 81 00


82 99 21 
83 17 44 
83 35 69 
83 53 96 
83 72 25 


83 90 56 
84 08 89 
84 27 24 
84 45 61 
84 64 00 


84 82 41 
85 00 84 
85 19 29 
85 37 76 
85 56 25 


Number 


Square Root 
29.682 
29.698
29.715 
29.732 
29.749 


29.766 
29.783 
29.799 
29.816 
29.833 


29.850 
29.866 
29.883 
29.900 
29.916 


29.933 
29.950 
29.967 
29.983 
30.000 


30.017 
30.033 
30.050 
30.067 
30.083 


30.100 
30.116 
30.133
30.150 
30.166 


30.183 
30.199
30.216 
30.232 
30.249 


30.265 
30.282 
30.299 
30.315 
30.332


30.348 
30.364 
30.381 
30.397 
30.414 


926 
927 
928 
929 
930 


931 
932 
933 
934 
935 


936 
937 
938 
939 
940 


941 
942 
943 
944 
945 


Square 

85 74 76 
85 93 29 
86 11 84 
86 30 41 
86 49 00


86 67 61 
86 86 24
87 04 89 
87 23 56 
87 42 25 


87 60 96 
87 79 69
87 98 44 
88 17 21 
88 36 00


88 54 81 
88 73 64 
88 92 49 
89 11 36 
89 30 25 


89 49 16 
89 68 09
89 87 04
90 06 01
90 25 00


90 44 01
90 63 04
90 82 09
91 01 16
91 20 25 


91 39 36 
91 58 49 
91 77 64 
91 96 81 
92 16 00


92 35 21 
92 54 44 
92 73 69 
92 92 96 
93 12 25 


93 31 56 
93 50 89
93 70 24 
93 89 61 
94 09 00 


Square Root 


30.430 
30.447 
30.463 
30.480 
30.496 


30.512 
30.529 
30.545
30.561 
30.578 


30.594 
30.610 
30.627 
30.643 
30.659 


30.676 
30.692 
30.708 
30.725 
30.741 


30.757 
30.773 
30.790 
30.806 
30.822 


30.838 
30.854 
30.871 
30.887 
30.903


30.919 
30.935 
30.952 
30.968 
30.984 


31.000 
31.016 
31.032 
31.048 
31.064


31.081 
31.097 
31.113 
31.129 
31.145 
Number
971 
972 
973 
974 
975 


976 
977 
978 
979 
980 


981 
982 
983 
984 
985 


Square 
94 28 41 
94 47 84 
94 67 29 
94 86 76 
95 06 25 


95 25 76 
95 45 29 
95 64 84 
95 84 41 
96 04 00 


96 23 61 
96 43 24 
96 62 89 
96 82 56 
97 02 25 


Square Root 
31.161 
31.177 
31.193 
31.209 
31.225 


31.241 
31.257 
31.273 
31.289 
31.305 


31.321 
31.337 
31.353 
31.369 
31.385 


Number 
986 
987 
988 
989 
990


991 
992 
993 
994 
995 


996 
997 
998 
999 
1000 


Square 

97 21 96 
97 41 69 
97 61 44 
97 81 21 
98 01 00


98 20 81 
98 40 64 
98 60 49 
98 80 36
99 00 25 


99 20 16 
99 40 09 
99 60 04 
99 80 01
100 00 00 


Square Root 
31.401 
31.417 
31.432 
31.448 
31.464 


31.480 
31.496 
31.512
31.528 
31.544 


31.559 
31.575 
31.591
31.607 
31.623 
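
Any entry in the foregoing table may be checked, or a missing one supplied, by direct calculation. The following is a minimal sketch in Python and is no part of the original tables. The function name `table_row` is hypothetical, and the sketch assumes entries of 100 and upwards, with the square printed in the table's style (the last four digits set off in pairs). A machine value may occasionally differ from a printed one in the last decimal place.

```python
import math

def table_row(n: int) -> tuple[int, str, str]:
    """Return (number, square, square root) laid out as in the table.

    Assumption: n >= 100, so that the square has at least five digits.
    """
    digits = str(n * n)
    # Set off the last four digits in pairs, e.g. 148996 -> "14 89 96".
    square = f"{digits[:-4]} {digits[-4:-2]} {digits[-2:]}"
    # The table gives square roots to three places of decimals.
    root = f"{math.sqrt(n):.3f}"
    return n, square, root

if __name__ == "__main__":
    for n in (386, 400, 1000):
        print(*table_row(n))
```

The same function may be run over a whole range, say 401 to 419, to fill a gap in a damaged copy of the table.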


APPENDIX VIII 


NOTE ON THE STANDARDIZATION OF 
MARKS 


In addition to the method of standardization given on p. 31,
using standard deviation, two simpler, quicker but less
accurate methods may be noted.


1. The use of a five-point scale.

The scores are arranged in order of merit and divided into
groups with the following percentages of cases.

    A         B      C      D         E
  Top 5%     25%    40%    25%    Bottom 5%

If marks are given to each question we could use the following:
E=1, D=2, C=3, B=4, A=5.

Thus, the maximum mark is 5 x number of questions and the
marks may be converted into percentages by multiplying by
20 / (number of questions). This method gives an average mark of
about 60% and a fairly constant dispersion.


2. The use of quartiles.

A straight-line graph is plotted giving the actual raw scores at
the three quartile points (first quartile, median and third quartile)
and standard quartile scores of 40, 50 and 60 respectively. Such
a method gives a standard deviation of about 15, which is
convenient with a mean of 50 and extreme scores of 0 and 100,
with very occasional scores of less than 5 and greater than 95.
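
The two methods of this appendix may be sketched in Python as follows. This is an illustration under stated assumptions and not a prescribed procedure: the function names are hypothetical; tied scores are graded in the order in which they are listed; and for the quartile method the straight line is taken through the median (standard score 50) with a slope of 20 standard marks per interquartile range, so that the two quartiles fall exactly on 40 and 60 only when the raw scores are symmetrical about the median.

```python
from statistics import quantiles

GRADE_MARKS = {"A": 5, "B": 4, "C": 3, "D": 2, "E": 1}
# Cumulative share of cases, counted from the top of the order of merit:
# top 5% A, next 25% B, next 40% C, next 25% D, bottom 5% E.
CUTOFFS = [(0.05, "A"), (0.30, "B"), (0.70, "C"), (0.95, "D"), (1.00, "E")]

def five_point_grades(scores):
    """Grade each score A to E by its place in the order of merit."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    grades = [None] * len(scores)
    for place, i in enumerate(order):
        share = (place + 0.5) / len(scores)  # mid-rank position, below 1
        grades[i] = next(g for limit, g in CUTOFFS if share <= limit)
    return grades

def percentage(question_grades):
    """Convert one candidate's grades, one per question, to a percentage."""
    total = sum(GRADE_MARKS[g] for g in question_grades)
    return total * 20 / len(question_grades)

def quartile_standard_scores(scores):
    """Map raw scores so that the median becomes 50 (assumed line, see above).

    Fails with a division by zero if the two quartiles coincide.
    """
    q1, median, q3 = quantiles(scores, n=4)
    return [50 + 20 * (x - median) / (q3 - q1) for x in scores]
```

With twenty candidates the first function gives one A, five B's, eight C's, five D's and one E, which are the percentages laid down above; a candidate graded C on every question receives 60%, in agreement with the stated average.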
For an account of recent work in factorial analysis the student
is recommended to read Professor Godfrey H. Thomson's
Factorial Analysis of Human Ability, Second Edition. This
admirably written and impartial work not only gives a clear
account of the ideas of various workers in this field in terms of
fairly simple mathematics, but it does much to reconcile some of
the apparently different ideas of the American authorities.

The Measurement of Abilities by P. E. Vernon is the best work
extant on the statistics of mental testing, marking and the 'new'
examining.

The Factors of the Mind by Sir Cyril Burt is an excellent work on
the measurement of mental traits and it should be read in conjunction
with Thomson's book which we have mentioned above.

The original research in educational matters which appears in
The British Journal of Educational Psychology very often makes great
use of statistical methods, and in recent issues of the analysis of
variance in particular. A new section of The British Journal, devoted
solely to statistical matters, has made its appearance.


FAIRLY EASY WORKS 

Mental Tests. Ballard. University of London Press. 

Group Tests of Intelligence. Ballard. University of London Press.

The Science of Marking. Thomas. Murray. 

Statistical Calculations for Beginners. Chambers. Cambridge 
University Press. 

How to Calculate a Correlation. Thomson. Harrap.

A First Course in Statistics. Lindquist. Harrap. 

The Distribution and Relations of Educational Abilities. Burt. King. 

A Guide to Mental Testing. Cattell. University of London Press. 

The Selection of Children for Secondary Education. Davies and Jones. 
Harrap. 

Some Recent Work in Factorial Analysis and a Retrospect. Thomson. 
Harrap.

The Testing of Intelligence. Ed. Hamley. Evans.

An Introduction to the Computation of Statistics. Dawson. University 
of London Press. 

Elementary Matrices. Turnbull and Aitken. Blackie. 

Intelligence, Concrete and Abstract. Alexander. British Journal of 
Psychology Monograph. 

An Examination of Examinations. Hartog and Rhodes. Macmillan. 

Statistics in Psychology and Education. Garrett. Longmans Green. 

The Reliability of Examinations. Valentine and Emmett. University 
of London Press. 

Essentials of Mental Measurement. Brown and Thomson. Cambridge 
University Press. 

Research in Education. Oliver. Allen & Unwin. 

Mental and Scholastic Tests. Burt. King. 

Elementary Statistics. Levy and Preidel. Nelson. 

Elements of Statistics. Bowley. Scribner. 


MODERATELY DIFFICULT WORKS 


The Measurement of Abilities.1 Vernon. University of London Press.

The Factorial Analysis of Human Ability1 (Second Edition). Thomson.
University of London Press.

The Factors of the Mind1 (Second Edition). Burt. University of
London Press.

The Abilities of Man. Spearman. Macmillan. 

An Introduction to the Theory of Statistics.2 Yule and Kendall. Griffin.

Statistical Methods. Snedecor. Iowa College. 

Statistical Method. Kelley. Macmillan. 

Statistical Procedures and their Mathematical Bases. Peters and Van 
Voorhis. McGraw-Hill. 

Statistical Methods for Research Workers.2 Fisher. Oliver & Boyd. 

Design of Experiments.2 Fisher. Oliver & Boyd. 

Methods of Statistical Analysis. Goulden. Wiley. 

The Vectors of Mind. Thurstone. University of Chicago Press. 

Primary Mental Abilities. Thurstone. University of Chicago Press. 

1 The first three works are of great importance to students of education and
psychology.
2 These books contain useful sets of statistical tables.


BIBLIOGRAPHY 


Psychometric Methods. Guilford. McGraw-Hill. 

Tables for Statisticians and Biometricians. Pearson. Cambridge. 
The Methods of Statistics.1 Tippett. Oxford.

Statistical Tables. Fisher and Yates. Oliver & Boyd. 
Statistical Analysis in Educational Research. Lindquist. Harrap. 
Statistical Methods Applied to Education. Rugg. Houghton. 
Statistical Analysis in Biology. Mather. Methuen. 
Fundamentals of Statistics. Kelley. Harvard. 

The Advanced Theory of Statistics. Kendall. Lippincott. 

The Fundamentals of Statistics. Thurstone. Macmillan. 
Crossroads in the Mind of Man. Kelley. Stanford Univ. 
Probability, Statistics and Truth. Mises. Macmillan. 


1 This contains an excellent explanation of analysis of variance.
AGE ALLOWANCE, 106-10
Aitken, 126 

Alexander, W. P., 106, 130 
Alienation, 53, 133 

Allport, 129 

Anastasi, 129 

Arithmetic Mean, 13, 14, 16 
Ascendency-Submission Scale, 129 
Association, Yule's coeff. of, 58
Average Deviation, 24, 30 
Axes, 174 


BIMODAL CURVE, 12 

Binet, 116, 203 

Bipolar Components, 164 

Biserial correlation, 59 

Bravais, 41 

Brereton, 114 

Burt, v, viii, 127, 128, 146, 153, 163, 171,
Butler, 75




CALCULUS, 176-80 
Cattell, R. B., 129 
Central tendency, 12 
Centroid method, 127 
Chi-squared, 75, 138-45 
Chronological Age, 96, 115, 116 
Colligation, 57 
Column diagram, 8 
Communality, 53, 123 
Compounding marks, 101, 102 
Contingency, 142-5 
Correction, Sheppard's, for grouping, 29 
Correlation, 41-8, 120, 200 

biserial, 59 

errors, 51, 77-82 

examples, 63-73 

partial, 54 

rank, 48 

ratio, 62, 154 

Spearman, 196 

spurious, 61 

tetrachoric, 55-8
Cosine, 129, 179-80 


Covariance, 172


Cumulative frequency, 7, 2° 
Curve-fitting, 96 


D (MEASURE OF VARIABILITY), 24, 30 

Data, 4 

Deciles, 18 

Degrees of Freedom, 84, 139-45, 150,
152, 159, 162

Determination, 148 

Deviations, 23-32 

Differences, 48, 82, 83, 149, 196 

Differentiation, 176-9 

Distributions, 7-28, 87-97 


EDUCATION ACT, 1944, 105 
Educational age, 115 

Einstein, 2 

Elderton, 141, 144 

ε (epsilon), 179, 190

Eta (correlation ratio), 62, 154 
Error (curve of), 9, 10, 11, 87-97, 199-5
Errors, 75-83, 147 

Estimates, 51-2 

Examinations, 98-114 
Experiments, Design of, 146, 166 


F (VARIANCE RATIO), 151, 153, 154, 159,
165

Factors, 119-38 

Fisher, R. A., 81, 84, 138, 141, 145, 146, 
166, 172 

Fitting Curve, 96 

Forecasting Efficiency, 51-2 

Frequency Distribution, 7-28, 190 

Frequency Polygon, 9 


g FACTOR, 120-7 

Gallup, 76 

Galton, 37, 146 

Garrett, 62 

Gaussian curve, 87, 190 
Goethe, 1 

Gosset, W. S. (‘Student’), 84 
Graeco-Latin Square, 1 
Graphs, 173-4 

Group factors, 128 
Guessing, correction for, 115 
Guilford, 129 


HARTOG, 114 
Heterogeneity, 86 
Hierarchical order, 121-3 




Histogram, 8, 189 
Holzinger, 15, 129 
Hotelling, 128, 131 
Hyperspace, 127, 131, 182 


INFLECTION, POINTS OF, 89, 193 

Integration, 176, 179, 193-4 

Intelligence quotient, 85, 115 

Intelligence Test, 5, 21, 32, 88, 92, 106, 
115-17 

Interaction, 166 

Interquartile Range, 23, 30, 76 


k (COEFFICIENT OF ALIENATION), 53, 133
Kelley, 15, 53, 205 
Kurtosis, 33, 191 


LAPLACE, 87 

Latin Square, 167-70 

Least Squares, 37, 117, 195, 200-1 
Leptokurtic curves, 33 

Loadings, 123-5 


MARKS, 13, 98-116 

Matrix, 120-6 

Maxima and Minima, 173, 179 

McCall, 32, 81 

Mean, 13-14, 22, 82, 151 

Measurement, Nature of, 1-6 

Median, 14, 15, 17, 23 

Mencius, 119 

Mendel, 188 

Mental age, 116 

Mental tests, 5, 21, 32, 88, 92, 106, 
115-17 

Minor Determinant, 123, 126 

Mode, 12, 23, 27 

Moray House Tests, 32, 106, 116 

Multiple correlation, 54, 132 

Multiple Factor Analysis, 125, 133 


NEW TYPE EXAMINATION, 114-15 
Norm, 117 

Normal Curve, 10, 87-97, 188-95 
Normal Curve, Tables, 91, 92, 95 
Normalized Scores, 31, 42 

Null Hypothesis, 136 


OBLIQUE FACTORS, 85, 130, 182 
Ogive, 7 

Order of determinant, 125 
Order of merit, 18, 48, 98 
Orthogonal factor, 130, 182 
Oval diagrams, 123


INDEX


PARTIAL CORRELATION, 54 
Pascal's Triangle, 188, 189 
Pearson, K., 41, 56, 58-9, 138, 146 
Percentiles, 18-22, 34 
Perseveration (p), 62

Peters, 61, 81 

Physical measurement, 6, 87 
Piaggio, 134, 201 

Pivotal condensation, 126 
Platykurtic curve, 33 
Principal components, 130-1 
Probability, 75-81 

Probable Error, 24, 76-82 
Product-Moment, 41 
Prophecy-formula, 85 


QUARTILE DEVIATION, 23 
Quartiles, 18 


RANDOMIZATION, 166 

Rank (of a matrix), 126 

Ranks (correlation), 48, 67, 196 

Ratio (correlation), 62, 148 
(significance), 81, 149-50 
(variance), 151-4, 159, 165 

Regression, 37-40, 132, 200 

Regression Equation, 37-9, 132, 200 

Reliability of Tests, 84, 160, 163 

Replication, 166 

Rhodes, 114 

Rotation of Axes, 131 


s FACTOR, 122-6, 128
Sample (small), 84
Scatter-diagram, 36 
Semi-interquartile range, 23, 30 
Sheppard, 29, 57, 154 
Sigma, 13, 24-5, 25-31, 92-3 
Significance, 81, 137-8, 159-63 
Sine, 56, 180-3 
Skew curves, 11, 33-4 
Skewness, 33-4 
Slide-rule, 104, 183-6 
Snedecor, 172 
Sones, 59 
Spearman, 5, 85, 119, 123, 129, 130, 146 
Specificity, 123 
Squares and Square Roots, 25, 206-17 
Standard deviation, 24-30 
Standard error, 76-8, 82-4, 149, 156 
Standardization, 31-2, 117, 218 
Straight line, 37, 174-5 

T SCORES, 32

t, Student's ratio, 84, 159, 152-6

Tetrachoric correlation, 55-9 

Tetrad differences, 122-4, 133-5 

Thomson, G. H., 105, 118, 121, 125, 
127, 129 

Thurstone, L. L., 59, 127, 128, 129 

Tippett, 172 

Trigonometrical ratios, 180-1 

Turnbull, 126 

Two-factor theory, 120-8, 133-4


UNLIKE SIGNS (Tetrachoric), 57
VALIDITY OF TESTS, 5, 85, 203-5
Variability, 34 

Variance, 31, 54, 146-72 
Variance, Analysis of, 138, 146-72 
Vernon, P. E., 114, 116, 219 


w, WILL FACTOR, 62

Webb, 62 

YULE, 57, 58, 150, 172 

z, STANDARDIZED SCORES, 31, 32

zi, the hyperbolic arctangent of r, 81, 
194 


